ABSTRACT



The present invention relates to a method for purifying recombinant HCV single
or specific oligomeric envelope proteins selected from the group consisting of E1 and/or
E2 and/or E1/E2, characterized in that upon lysing the transformed host cells to isolate
the recombinantly expressed protein a disulphide bond cleavage or reduction step is
carried out with a disulphide bond cleavage agent. The present invention also relates to a
composition isolated by such a method. The present invention also relates to the
diagnostic and therapeutic application of these compositions. Furthermore, the invention
relates to the use of HCV E1 protein and peptides for prognosing and monitoring the
clinical effectiveness and/or clinical outcome of HCV treatment.
PURIFIED HEPATITIS C VIRUS ENVELOPE PROTEINS FOR DIAGNOSTIC
AND THERAPEUTIC USE

Field of the invention 

The present invention relates to the general fields of recombinant protein
expression, purification of recombinant proteins, synthetic peptides, diagnosis of HCV
infection, prophylactic treatment against HCV infection and to the prognosis/monitoring of
the clinical efficiency of treatment of an individual with chronic hepatitis, or the
prognosis/monitoring of natural disease.

More particularly, the present invention relates to purification methods for hepatitis
C virus envelope proteins, the use in diagnosis, prophylaxis or therapy of HCV envelope
proteins purified according to the methods described in the present invention, the use of
single or specific oligomeric E1 and/or E2 and/or E1/E2 envelope proteins in assays for
monitoring disease, and/or diagnosis of disease, and/or treatment of disease. The invention
also relates to epitopes of the E1 and/or E2 envelope proteins and monoclonal antibodies
thereto, as well as their use in diagnosis, prophylaxis or treatment.

Background of the invention 

The E2 protein purified from cell lysates according to the methods described in the 
present invention reacts with approximately 95% of patient sera. This reactivity is similar 
to the reactivity obtained with E2 secreted from CHO cells (Spaete et al., 1992). However, 
the intracellularly expressed form of E2 may more closely resemble the native viral
envelope protein because it contains high-mannose carbohydrate motifs, whereas the E2
protein secreted from CHO cells is further modified with galactose and sialic acid sugar
moieties. When the aminoterminal half of E2 is expressed in the baculovirus system, only
about 13 to 21% of sera from several patient groups can be detected (Inoue et al., 1992).
After expression of E2 from E. coli, the reactivity of HCV sera was even lower and ranged
from 14% (Yokosuka et al., 1992) to 17% (Mita et al., 1992).
About 75% of HCV sera (and 95% of chronic patients) are anti-E1 positive using
the purified, vaccinia-expressed recombinant E1 protein of the present invention, in sharp
contrast with the results of Kohara et al. (1992) and Hsu et al. (1993). Kohara et al. used a
vaccinia-virus expressed E1 protein and detected anti-E1 antibodies in 7 to 23% of
patients, while Hsu et al. only detected 14/50 (28%) sera using baculovirus-expressed E1.

These results show that not only a good expression system but also a good
purification protocol is required to reach a high reactivity of the envelope proteins with
human patient sera. This can be obtained using the proper expression system and/or
purification protocols of the present invention, which guarantee the conservation of the
natural folding of the protein, eliminate contaminating proteins, and preserve the
conformation, and thus the reactivity, of the HCV envelope proteins. The amounts of purified HCV
envelope protein needed for diagnostic screening assays are in the range of grams per year.
For vaccine purposes, even higher amounts of envelope protein would be needed. 
Therefore, the vaccinia virus system may be used for selecting the best expression 
constructs and for limited upscaling, and large-scale expression and purification of single 
or specific oligomeric envelope proteins containing high-mannose carbohydrates may be 
achieved when expressed from several yeast strains. In the case of hepatitis B for example, 
manufacturing of HBsAg from mammalian cells was much more costly compared with 
yeast-derived hepatitis B vaccines. 

Aims of the invention

It is an aim of the present invention to provide a new purification method for
recombinantly expressed E1 and/or E2 and/or E1/E2 proteins such that said recombinant
proteins are directly usable for diagnostic and vaccine purposes as single or specific
oligomeric recombinant proteins free from contaminants instead of aggregates.

It is another aim of the present invention to provide compositions comprising
purified (single or specific oligomeric) recombinant E1 and/or E2 and/or E1/E2
glycoproteins comprising conformational epitopes from the E1 and/or E2 domains of
HCV.

It is yet another aim of the present invention to provide novel recombinant vector
constructs for recombinantly expressing E1 and/or E2 and/or E1/E2 proteins, as well as
host cells transformed with said vector constructs.

It is also an aim of the present invention to provide a method for producing and
purifying recombinant HCV E1 and/or E2 and/or E1/E2 proteins.

It is also an aim of the present invention to provide diagnostic and immunogenic
uses of the recombinant HCV E1 and/or E2 and/or E1/E2 proteins of the present invention,
as well as to provide kits for diagnostic use, vaccines or therapeutics comprising any of the
recombinant HCV E1 and/or E2 and/or E1/E2 proteins of the present invention.

It is further an aim of the present invention to provide for a new use of E1, E2,
and/or E1/E2 proteins, or suitable parts thereof, for monitoring/prognosing the response to
treatment of patients (e.g. with interferon) suffering from HCV infection.

It is also an aim of the present invention to provide for the use of the recombinant
E1, E2, and/or E1/E2 proteins of the present invention in HCV screening and confirmatory
antibody tests.

It is also an aim of the present invention to provide E1 and/or E2 peptides which
can be used for diagnosis of HCV infection and for raising antibodies. Such peptides may
also be used to isolate human monoclonal antibodies.

It is also an aim of the present invention to provide monoclonal antibodies, more
particularly human monoclonal antibodies or mouse monoclonal antibodies which are
humanized, which react specifically with E1 and/or E2 epitopes, either comprised in
peptides or conformational epitopes comprised in recombinant proteins.

It is also an aim of the present invention to provide possible uses of anti-E1 or
anti-E2 monoclonal antibodies for HCV antigen detection or for therapy of chronic HCV
infection.

It is also an aim of the present invention to provide kits for monitoring/prognosing 
the response to treatment (e.g. with interferon) of patients suffering from HCV infection or 
monitoring/prognosing the outcome of the disease. 



All the aims of the present invention are considered to have been met by the
embodiments as set out below. 

Definitions 

The following definitions serve to illustrate the different terms and expressions 
used in the present invention. 

The term 'hepatitis C virus single envelope protein' refers to a polypeptide or an
analogue thereof (e.g. mimotopes) comprising an amino acid sequence (and/or amino acid
analogues) defining at least one HCV epitope of either the E1 or the E2 region. These
single envelope proteins in the broad sense of the word may be either monomeric or
homo-oligomeric forms of recombinantly expressed envelope proteins. Typically, the sequences
defining the epitope correspond to the amino acid sequence of either the E1 or the E2
region of HCV (either identically or via substitution of analogues of the native amino acid
residue that do not destroy the epitope). In general, the epitope-defining sequence will be 3
or more amino acids in length, more typically 5 or more amino acids in length, more
typically 8 or more amino acids in length, and even more typically 10 or more amino acids
in length. With respect to conformational epitopes, the length of the epitope-defining
sequence can be subject to wide variations, since it is believed that these epitopes are
formed by the three-dimensional shape of the antigen (e.g. folding). Thus, the amino acids
defining the epitope can be relatively few in number, but widely dispersed along the length
of the molecule, being brought into the correct epitope conformation via folding. The
portions of the antigen between the residues defining the epitope may not be critical to the
conformational structure of the epitope. For example, deletion or substitution of these
intervening sequences may not affect the conformational epitope provided sequences
critical to epitope conformation are maintained (e.g. cysteines involved in disulfide
bonding, glycosylation sites, etc.). A conformational epitope may also be formed by 2 or
more essential regions of subunits of a homo-oligomer or hetero-oligomer.

The HCV antigens of the present invention comprise conformational epitopes from
the E1 and/or E2 (envelope) domains of HCV. The E1 domain, which is believed to
correspond to the viral envelope protein, is currently estimated to span amino acids 192-383
of the HCV polyprotein (Hijikata et al., 1991). Upon expression in a mammalian
system (glycosylated), it is believed to have an approximate molecular weight of 35 kDa as
determined via SDS-PAGE. The E2 protein, previously called NS1, is believed to span
amino acids 384-809 or 384-746 (Grakoui et al., 1993) of the HCV polyprotein and to also
be an envelope protein. Upon expression in a vaccinia system (glycosylated), it is believed
to have an apparent gel molecular weight of about 72 kDa. It is understood that these
protein endpoints are approximations (e.g. the carboxy-terminal end of E2 could lie
somewhere in the 730-820 amino acid region, e.g. ending at amino acid 730, 735, 740,
742, 744, 745, preferably 746, 747, 748, 750, 760, 770, 780, 790, 800, 809, 810, 820). The
E2 protein may also be expressed together with the E1, P7 (aa 747-809), NS2 (aa 810-1026),
NS4A (aa 1658-1711) or NS4B (aa 1712-1972). Expression together with these
other HCV proteins may be important for obtaining the correct protein folding.

It is also understood that the isolates used in the examples section of the present
invention were not intended to limit the scope of the invention and that any HCV isolate
from type 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or any other new genotype of HCV is a suitable source
of E1 and/or E2 sequence for the practice of the present invention.

The E1 and E2 antigens used in the present invention may be full-length viral
proteins, substantially full-length versions thereof, or functional fragments thereof (e.g.
fragments which are not missing sequence essential to the formation or retention of an
epitope). Furthermore, the HCV antigens of the present invention can also include other
sequences that do not block or prevent the formation of the conformational epitope of
interest. The presence or absence of a conformational epitope can be readily determined
through screening the antigen of interest with an antibody (polyclonal serum or monoclonal
to the conformational epitope) and comparing its reactivity to that of a denatured version of
the antigen which retains only linear epitopes (if any). In such screening using polyclonal
antibodies, it may be advantageous to adsorb the polyclonal serum first with the denatured
antigen and see if it retains antibodies to the antigen of interest.

The HCV antigens of the present invention can be made by any recombinant
method that provides the epitope of interest. For example, recombinant intracellular
expression in mammalian or insect cells is a preferred method to provide glycosylated E1
and/or E2 antigens in 'native' conformation as is the case for the natural HCV antigens.

Yeast cells and mutant yeast strains (e.g. mnn9 mutant (Kniskern et al., 1994) or
glycosylation mutants derived by means of vanadate resistance selection (Ballou et al.,
1991)) may be ideally suited for production of secreted high-mannose-type sugars; whereas
proteins secreted from mammalian cells may contain modifications including galactose or
sialic acids which may be undesirable for certain diagnostic or vaccine applications.
However, it may also be possible and sufficient for certain applications, as is known for
proteins, to express the antigen in other recombinant hosts (such as E. coli) and renature
the protein after recovery.

The term 'fusion polypeptide' intends a polypeptide in which the HCV antigen(s)
are part of a single continuous chain of amino acids, which chain does not occur in nature.
The HCV antigens may be connected directly to each other by peptide bonds or be
separated by intervening amino acid sequences. The fusion polypeptides may also contain
amino acid sequences exogenous to HCV.

The term 'solid phase' intends a solid body to which the individual HCV antigens or
the fusion polypeptide comprised of HCV antigens are bound covalently or by noncovalent
means such as hydrophobic adsorption.

The term 'biological sample' intends a fluid or tissue of a mammalian individual
(e.g. an anthropoid, a human) that commonly contains antibodies produced by the
individual, more particularly antibodies against HCV. The fluid or tissue may also contain
HCV antigen. Such components are known in the art and include, without limitation,
blood, plasma, serum, urine, spinal fluid, lymph fluid, secretions of the respiratory,
intestinal or genitourinary tracts, tears, saliva, milk, white blood cells and myelomas. Body
components include biological liquids. The term 'biological liquid' refers to a fluid
obtained from an organism. Some biological fluids are used as a source of other products,
such as clotting factors (e.g. Factor VIII:C), serum albumin, growth hormone and the like.
In such cases, it is important that the source of biological fluid be free of contamination by 
virus such as HCV. 

The term 'immunologically reactive' means that the antigen in question will react
specifically with anti-HCV antibodies present in a body component from an HCV infected
individual.



The term 'immune complex' intends the combination formed when an antibody 
binds to an epitope on an antigen. 

'E1' as used herein refers to a protein or polypeptide expressed within the first 400
amino acids of an HCV polyprotein, sometimes referred to as the E, ENV or S protein. In
its natural form it is a 35 kDa glycoprotein which is found in strong association with
membranes. In most natural HCV strains, the E1 protein is encoded in the viral polyprotein
following the C (core) protein. The E1 protein extends from approximately amino acid (aa)
192 to about aa 383 of the full-length polyprotein.

The term 'E1' as used herein also includes analogs and truncated forms that are
immunologically cross-reactive with natural E1, and includes E1 proteins of genotypes 1,
2, 3, 4, 5, 6, 7, 8, 9, 10, or any other newly identified HCV type or subtype.

'E2' as used herein refers to a protein or polypeptide expressed within the first 900
amino acids of an HCV polyprotein, sometimes referred to as the NS1 protein. In its
natural form it is a 72 kDa glycoprotein that is found in strong association with
membranes. In most natural HCV strains, the E2 protein is encoded in the viral polyprotein
following the E1 protein. The E2 protein extends from approximately amino acid position
384 to amino acid position 746; another form of E2 extends to amino acid position 809.
The term 'E2' as used herein also includes analogs and truncated forms that are
immunologically cross-reactive with natural E2. For example, insertions of multiple
codons between codon 383 and 384, as well as deletions of amino acids 384-387 have been
reported by Kato et al. (1992).

'E1/E2' as used herein refers to an oligomeric form of envelope proteins containing
at least one E1 component and at least one E2 component.

The term 'specific oligomeric' E1 and/or E2 and/or E1/E2 envelope proteins refers
to all possible oligomeric forms of recombinantly expressed E1 and/or E2 envelope
proteins which are not aggregates. E1 and/or E2 specific oligomeric envelope proteins are
also referred to as homo-oligomeric E1 or E2 envelope proteins (see below).

The term 'single or specific oligomeric' E1 and/or E2 and/or E1/E2 envelope
proteins refers to single monomeric E1 or E2 proteins (single in the strict sense of the
word) as well as specific oligomeric E1 and/or E2 and/or E1/E2 recombinantly expressed
proteins. These single or specific oligomeric envelope proteins according to the present
invention can be further defined by the following formula: (E1)x(E2)y, wherein x can be a
number between 0 and 100, and y can be a number between 0 and 100, provided that x and
y are not both 0. With x=1 and y=0, said envelope proteins include monomeric E1.
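The (E1)x(E2)y formula can be illustrated with a minimal sketch. The function name and the returned labels are hypothetical, and treating x and y as whole-number subunit counts is an assumption (the text only says "a number between 0 and 100"). The labels follow the definitions of monomeric, homo-oligomeric and E1/E2 forms given in this section.

```python
def describe_envelope_complex(x: int, y: int) -> str:
    """Name the species (E1)x(E2)y for subunit counts x (E1) and y (E2).

    Hypothetical helper: x and y are assumed to be integers in [0, 100].
    """
    if not (0 <= x <= 100 and 0 <= y <= 100):
        raise ValueError("x and y must each lie between 0 and 100")
    if x == 0 and y == 0:
        raise ValueError("x and y may not both be 0")
    if x > 0 and y > 0:
        # At least one E1 and at least one E2 component.
        return "E1/E2 oligomer"
    name = "E1" if x > 0 else "E2"
    count = x + y  # one of the two counts is 0 here
    return f"monomeric {name}" if count == 1 else f"{name} homo-oligomer"


print(describe_envelope_complex(1, 0))  # the x=1, y=0 case from the text
```

For example, `describe_envelope_complex(0, 2)` corresponds to an E2/E2 dimer and `describe_envelope_complex(1, 1)` to the simplest E1/E2 form.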

The term 'homo-oligomer' as used herein refers to a complex of E1 and/or E2
containing more than one E1 or E2 monomer: e.g. E1/E1 dimers, E1/E1/E1 trimers or
E1/E1/E1/E1 tetramers and E2/E2 dimers, E2/E2/E2 trimers or E2/E2/E2/E2 tetramers, E1
pentamers and hexamers, E2 pentamers and hexamers or any higher-order homo-oligomers
of E1 or E2 are all 'homo-oligomers' within the scope of this definition. The oligomers may
contain one, two, or several different monomers of E1 or E2 obtained from different types
or subtypes of hepatitis C virus including for example those described in an international
application published under WO 94/25601 and European application No. 94870166.9 both
by the present applicants. Such mixed oligomers are still homo-oligomers within the scope
of this invention, and may allow more universal diagnosis, prophylaxis or treatment of
HCV.

The term 'purified' as applied to proteins herein refers to a composition wherein the
desired protein comprises at least 35% of the total protein component in the composition.
The desired protein preferably comprises at least 40%, more preferably at least about 50%,
more preferably at least about 60%, still more preferably at least about 70%, even more
preferably at least about 80%, even more preferably at least about 90%, and most
preferably at least about 95% of the total protein component. The composition may contain
other compounds such as carbohydrates, salts, lipids, solvents, and the like, without
affecting the determination of the percentage purity as used herein. An 'isolated' HCV
protein intends an HCV protein composition that is at least 35% pure.
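The purity calculation implied by this definition can be sketched as follows. The function names and the tier tuple are hypothetical and serve only as an illustration: purity is taken as the desired protein divided by total protein, so non-protein components (carbohydrates, salts, lipids, solvents) are left out of the denominator.

```python
from typing import Optional

# Thresholds (percent of total protein) listed in the definition of 'purified'.
PURITY_TIERS = (35, 40, 50, 60, 70, 80, 90, 95)


def percent_purity(desired_protein: float, total_protein: float) -> float:
    """Percentage of the total protein component made up by the desired protein.

    Both amounts must be in the same unit (e.g. mg); non-protein material is
    not part of total_protein.
    """
    if total_protein <= 0:
        raise ValueError("total_protein must be positive")
    if not 0 <= desired_protein <= total_protein:
        raise ValueError("desired_protein must lie between 0 and total_protein")
    return 100.0 * desired_protein / total_protein


def is_purified(purity: float) -> bool:
    """True if the composition meets the minimum 35% threshold."""
    return purity >= PURITY_TIERS[0]


def highest_tier_met(purity: float) -> Optional[int]:
    """Highest listed threshold reached, or None if below 35%."""
    met = [tier for tier in PURITY_TIERS if purity >= tier]
    return max(met) if met else None
```

With these assumptions, a preparation containing 7 mg of the desired protein in 10 mg of total protein would count as 'purified' and would meet the 70% tier.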

The term 'essentially purified proteins' refers to proteins purified such that they can
be used for in vitro diagnostic methods and as a therapeutic compound. These proteins are
substantially free from cellular proteins, vector-derived proteins or other HCV viral
components. Usually these proteins are purified to homogeneity (at least 80% pure,
preferably 90%, more preferably 95%, more preferably 97%, more preferably 98%, more
preferably 99%, even more preferably 99.5%, and most preferably the contaminating
proteins should be undetectable by conventional methods like SDS-PAGE and silver
staining).
The term 'recombinantly expressed' used within the context of the present invention
refers to the fact that the proteins of the present invention are produced by recombinant 
expression methods be it in prokaryotes, or lower or higher eukaryotes as discussed in 
detail below. 

The term 'lower eukaryote' refers to host cells such as yeast, fungi and the like.
Lower eukaryotes are generally (but not necessarily) unicellular. Preferred lower
eukaryotes are yeasts, particularly species within Saccharomyces, Schizosaccharomyces,
Kluyveromyces, Pichia (e.g. Pichia pastoris), Hansenula (e.g. Hansenula polymorpha),
Yarrowia, Schwanniomyces, Zygosaccharomyces and the like.
Saccharomyces cerevisiae, S. carlsbergensis and K. lactis are the most commonly used
yeast hosts, and are convenient fungal hosts.

The term 'prokaryotes' refers to hosts such as E. coli, Lactobacillus, Lactococcus,
Salmonella, Streptococcus, Bacillus subtilis or Streptomyces. These hosts are also
contemplated within the present invention.

The term 'higher eukaryote' refers to host cells derived from higher animals, such as
mammals, reptiles, insects, and the like. Presently preferred higher eukaryote host cells are
derived from Chinese hamster (e.g. CHO), monkey (e.g. COS and Vero cells), baby
hamster kidney (BHK), pig kidney (PK15), rabbit kidney 13 cells (RK13), the human
osteosarcoma cell line 143B, the human cell line HeLa and human hepatoma cell lines like
Hep G2, and insect cell lines (e.g. Spodoptera frugiperda). The host cells may be provided
in suspension or flask cultures, tissue cultures, organ cultures and the like. Alternatively the
host cells may also be transgenic animals.

The term 'polypeptide' refers to a polymer of amino acids and does not refer to a
specific length of the product; thus, peptides, oligopeptides, and proteins are included
within the definition of polypeptide. This term also does not refer to or exclude
post-expression modifications of the polypeptide, for example, glycosylations, acetylations,
phosphorylations and the like. Included within the definition are, for example, polypeptides
containing one or more analogues of an amino acid (including, for example, unnatural
amino acids, PNA, etc.), polypeptides with substituted linkages, as well as other
modifications known in the art, both naturally occurring and non-naturally occurring.
The term 'recombinant polynucleotide or nucleic acid' intends a polynucleotide or
nucleic acid of genomic, cDNA, semisynthetic, or synthetic origin which, by virtue of its
origin or manipulation: (1) is not associated with all or a portion of a polynucleotide with
which it is associated in nature, (2) is linked to a polynucleotide other than that to which it 
is linked in nature, or (3) does not occur in nature. 

The term 'recombinant host cells', 'host cells', 'cells', 'cell lines', 'cell cultures', and 
other such terms denoting microorganisms or higher eukaryotic cell lines cultured as 
unicellular entities refer to cells which can be, or have been, used as recipients for a
recombinant vector or other transfer polynucleotide, and include the progeny of the
original cell which has been transfected. It is understood that the progeny of a single 
parental cell may not necessarily be completely identical in morphology or in genomic or 
total DNA complement as the original parent, due to natural, accidental, or deliberate 
mutation. 

The term 'replicon' is any genetic element, e.g., a plasmid, a chromosome, a virus, a 
cosmid, etc., that behaves as an autonomous unit of polynucleotide replication within a 
cell; i.e., capable of replication under its own control.

The term 'vector' is a replicon further comprising sequences providing replication 
and/or expression of a desired open reading frame. 

The term 'control sequence' refers to polynucleotide sequences which are necessary 
to effect the expression of coding sequences to which they are ligated. The nature of such 
control sequences differs depending upon the host organism; in prokaryotes, such control 
sequences generally include promoter, ribosomal binding site, and terminators; in 
eukaryotes, generally, such control sequences include promoters, terminators and, in some 
instances, enhancers. The term 'control sequences' is intended to include, at a minimum, all 
components whose presence is necessary for expression, and may also include additional 
components whose presence is advantageous, for example, leader sequences which govern 
secretion. 

The term 'promoter' is a nucleotide sequence which is comprised of consensus 
sequences which allow the binding of RNA polymerase to the DNA template in a manner 
such that mRNA production initiates at the normal transcription initiation site for the 
adjacent structural gene. 
The expression 'operably linked' refers to a juxtaposition wherein the components
so described are in a relationship permitting them to function in their intended manner. A
control sequence 'operably linked' to a coding sequence is ligated in such a way that
expression of the coding sequence is achieved under conditions compatible with the
control sequences.

An 'open reading frame' (ORF) is a region of a polynucleotide sequence which 
encodes a polypeptide and does not contain stop codons; this region may represent a 
portion of a coding sequence or a total coding sequence. 

A 'coding sequence' is a polynucleotide sequence which is transcribed into mRNA
and/or translated into a polypeptide when placed under the control of appropriate
regulatory sequences. The boundaries of the coding sequence are determined by a
translation start codon at the 5'-terminus and a translation stop codon at the 3'-terminus. A
coding sequence can include but is not limited to mRNA, DNA (including cDNA), and
recombinant polynucleotide sequences.
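As an illustration of these boundary definitions, the following minimal sketch locates a coding sequence in a DNA string. It assumes the standard genetic code (start codon ATG; stop codons TAA, TAG and TGA) and a sense-strand DNA alphabet, neither of which is specified in the text, and the function names are hypothetical.

```python
from typing import Optional

START_CODON = "ATG"                  # assumed: standard translation start
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumed: standard genetic code stops


def find_coding_sequence(dna: str) -> Optional[str]:
    """Return the first stretch from a start codon to the next in-frame stop
    codon (both included), or None if no such stretch exists."""
    seq = dna.upper()
    start = seq.find(START_CODON)
    while start != -1:
        # Walk codon by codon in the frame set by this start codon.
        for i in range(start, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                return seq[start:i + 3]
        start = seq.find(START_CODON, start + 1)
    return None


def open_reading_frame(coding_sequence: str) -> str:
    """The stop-free part of a coding sequence: everything before the final stop codon."""
    return coding_sequence[:-3]


cds = find_coding_sequence("ccATGAAATGGTAAgg")
print(cds)                      # ATGAAATGGTAA
print(open_reading_frame(cds))  # ATGAAATGG
```

The out-of-frame TGA in the example input is ignored, because only codons in the frame of the start codon are examined.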

As used herein, 'epitope' or 'antigenic determinant' means an amino acid sequence
that is immunoreactive. Generally an epitope consists of at least 3 to 4 amino acids, and
more usually consists of at least 5 or 6 amino acids; sometimes the epitope consists of
about 7 to 8, or even about 10 amino acids. As used herein, an epitope of a designated
polypeptide denotes epitopes with the same amino acid sequence as the epitope in the
designated polypeptide, and immunologic equivalents thereof. Such equivalents also
include strain, subtype (=genotype), or type(group)-specific variants, e.g. of the currently
known sequences or strains belonging to genotypes 1a, 1b, 1c, 1d, 1e, 1f, 2a, 2b, 2c, 2d, 2e,
2f, 2g, 2h, 2i, 3a, 3b, 3c, 3d, 3e, 3f, 3g, 4a, 4b, 4c, 4d, 4e, 4f, 4g, 4h, 4i, 4j, 4k, 4l, 5a, 5b,
6a, 6b, 6c, 7a, 7b, 7c, 8a, 8b, 9a, 9b, 10a, or any other newly defined HCV (sub)type. It is
to be understood that the amino acids constituting the epitope need not be part of a linear
sequence, but may be interspersed by any number of amino acids, thus forming a
conformational epitope.

The term 'immunogenic' refers to the ability of a substance to cause a humoral
and/or cellular response, whether alone or when linked to a carrier, in the presence or
absence of an adjuvant. 'Neutralization' refers to an immune response that blocks the
infectivity, either partially or fully, of an infectious agent. A 'vaccine' is an immunogenic
composition capable of eliciting protection against HCV, whether partial or complete. A
vaccine may also be useful for treatment of an individual, in which case it is called a
therapeutic vaccine.

The term 'therapeutic' refers to a composition capable of treating HCV infection. 

The term 'effective amount' refers to an amount of epitope-bearing polypeptide 
sufficient to induce an immunogenic response in the individual to which it is administered, 
or to otherwise detectably immunoreact in its intended system (e.g., immunoassay). 
Preferably, the effective amount is sufficient to effect treatment, as defined above. The 
exact amount necessary will vary according to the application. For vaccine applications or 
for the generation of polyclonal antiserum / antibodies, for example, the effective amount 
may vary depending on the species, age, and general condition of the individual, the 
severity of the condition being treated, the particular polypeptide selected and its mode of 
administration, etc. It is also believed that effective amounts will be found within a 
relatively large, non-critical range. An appropriate effective amount can be readily 
determined using only routine experimentation. Preferred ranges of El and/or E2 and/or 
E1/E2 single or specific oligomeric envelope proteins for prophylaxis of HCV disease are 
0.01 to 100 µg/dose, preferably 0.1 to 50 µg/dose. Several doses may be needed per
individual in order to achieve a sufficient immune response and subsequent protection 
against HCV disease. 



Detailed description of the invention 

More particularly, the present invention contemplates a method for isolating or 
purifying recombinant HCV single or specific oligomeric envelope protein selected from 
the group consisting of El and/or E2 and/or E1/E2, characterized in that upon lysing the 
transformed host cells to isolate the recombinantly expressed protein a disulphide bond 
cleavage or reduction step is carried out with a disulphide bond cleaving agent.

The essence of these 'single or specific oligomeric' envelope proteins of the 
invention is that they are free from contaminating proteins and that they are not disulphide 
bond linked with contaminants. 
The proteins according to the present invention are recombinantly expressed in
lower or higher eukaryotic cells or in prokaryotes. The recombinant proteins of the present 
invention are preferably glycosylated and may contain high-mannose-type, hybrid, or 
complex glycosylations. Preferentially said proteins are expressed from mammalian cell 
lines as discussed in detail in the Examples section, or in yeast such as in mutant yeast 
strains also as detailed in the Examples section. 

The proteins according to the present invention may be secreted or expressed 
within components of the cell, such as the ER or the Golgi Apparatus. Preferably, however, 
the proteins of the present invention bear high-mannose-type glycosylations and are 
retained in the ER or Golgi Apparatus of mammalian cells or are retained in or secreted 
from yeast cells, preferably secreted from yeast mutant strains such as the mnn9 mutant 
(Kniskern et al., 1994), or from mutants that have been selected by means of vanadate 
resistance (Ballou et al., 1991).

Upon expression of HCV envelope proteins, the present inventors could show that 
some of the free thiol groups of cysteines not involved in intra- or inter-molecular 
disulphide bridges, react with cysteines of host or expression-system-derived (e.g. 
vaccinia) proteins or of other HCV envelope proteins (single or oligomeric), and form 
aspecific intermolecular bridges. This results in the formation of 'aggregates' of HCV 
envelope proteins together with contaminating proteins. It was also shown in
WO 92/08734 that 'aggregates' were obtained after purification, but it was not described which
protein interactions were involved. In patent application WO 92/08734, recombinant E1/E2
proteins expressed with the vaccinia virus system were partially purified as aggregates and
only found to be 70% pure, rendering the purified aggregates not useful for diagnostic,
prophylactic or therapeutic purposes.

Therefore, a major aim of the present invention is the separation of single
or specific oligomeric HCV envelope proteins from contaminating proteins, and the use of the
purified proteins (> 95% pure) for diagnostic, prophylactic and therapeutic purposes. To
those purposes, the present inventors have been able to provide evidence that aggregated 
protein complexes ('aggregates') are formed on the basis of disulphide bridges and non- 
covalent protein-protein interactions. The present invention thus provides a means for 
selectively cleaving the disulphide bonds under specific conditions and for separating the 
cleaved proteins from contaminating proteins which greatly interfere with diagnostic,
prophylactic and therapeutic applications. The free thiol groups may be blocked (reversibly 
or irreversibly) in order to prevent the reformation of disulphide bridges, or may be left to 
oxidize and oligomerize with other envelope proteins (see definition homo-oligomer). It is 
to be understood that such protein oligomers are essentially different from the 'aggregates'
described in WO 92/08734 and WO 94/01778, since the level of contaminating proteins is 
undetectable. 

Said disulphide bond cleavage may also be achieved by:

(1) Performic acid oxidation, in which case the cysteine residues
are modified into cysteic acid (Moore et al., 1963).

(2) Sulfitolysis (R-S-S-R -> 2 R-SO₃⁻), for example by means of sulphite (SO₃²⁻) together
with a proper oxidant such as Cu²⁺, in which case the cysteine is modified into
S-sulphocysteine (Bailey and Cole, 1959).

(3) Reduction by means of mercaptans, such as dithiothreitol (DTT), β-mercaptoethanol,
cysteine, reduced glutathione, ε-mercapto-ethylamine, or thioglycollic acid, of which DTT and
β-mercaptoethanol are commonly used (Cleland, 1964). This is the preferred method of this
invention because it can be performed in an aqueous environment and because the
cysteine remains unmodified.

(4) Reduction by means of a phosphine (e.g. Bu 3 P) (Ruegg and Rudinger, 1977). 

All these compounds are thus to be regarded as agents or means for cleaving
disulphide bonds according to the present invention.

Said disulphide bond cleavage (or reducing) step of the present invention is 
preferably a partial disulphide bond cleavage (reducing) step (carried out under partial 
cleavage or reducing conditions). 

A preferred disulphide bond cleavage or reducing agent according to the present
invention is dithiothreitol (DTT). Partial reduction is obtained by using a low concentration
of said reducing agent, i.e. for DTT for example in the concentration range of about 0.1 to
about 50 mM, preferably about 0.1 to about 20 mM, preferably about 0.5 to about 10 mM,
preferably more than 1 mM, more than 2 mM or more than 5 mM, more preferably about
1.5 mM, about 2.0 mM, about 2.5 mM, about 5 mM or about 7.5 mM.
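
The nested DTT concentration ranges stated above can be captured in a small lookup, which may help when checking a planned protocol against the text. The following is only an illustrative sketch: the function name, the tier labels and the decision to treat the "about" bounds as exact inclusive numbers are assumptions, not part of the specification.

```python
# Hypothetical helper: names and tier labels are illustrative only.
# The "about" bounds from the text are treated as exact, inclusive values,
# which is a simplifying assumption.
DTT_RANGES_MM = [
    ("broad", 0.1, 50.0),           # about 0.1 to about 50 mM
    ("preferred", 0.1, 20.0),       # about 0.1 to about 20 mM
    ("more preferred", 0.5, 10.0),  # about 0.5 to about 10 mM
]


def narrowest_dtt_tier(conc_mm):
    """Return the label of the narrowest stated range containing conc_mm, or None."""
    label = None
    # Ranges are ordered from widest to narrowest, so the last match wins.
    for name, low, high in DTT_RANGES_MM:
        if low <= conc_mm <= high:
            label = name
    return label
```

For instance, each of the individually named concentrations (1.5, 2.0, 2.5, 5 and 7.5 mM) falls inside the narrowest of the three ranges.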

Said disulphide bond cleavage step may also be carried out in the presence of a
suitable detergent (as an example of a means for cleaving disulphide bonds or in
combination with a cleaving agent) able to dissociate the expressed proteins, such as
DecylPEG, EMPIGEN-BB, NP-40, sodium cholate or Triton X-100.

Said reduction or cleavage step (preferably a partial reduction or cleavage step) is
preferably carried out in the presence of a detergent. A preferred detergent
according to the present invention is Empigen-BB. The amount of detergent used is
preferably in the range of 1 to 10%, preferably more than 3%, more preferably about 3.5%
of a detergent such as Empigen-BB.

A particularly preferred method for obtaining disulphide bond cleavage employs a
combination of a classical disulphide bond cleavage agent as detailed above and a
detergent (also as detailed above). As contemplated in the Examples section, the particular
combination of a low concentration of DTT (1.5 to 7.5 mM) and about 3.5% of
Empigen-BB is proven to be a particularly preferred combination of reducing agent and detergent for
the purification of recombinantly expressed E1 and E2 proteins. Upon gel filtration
chromatography, said partial reduction is shown to result in the production of possibly
dimeric E1 protein and separation of this E1 protein from contaminating proteins that cause
false reactivity upon use in immunoassays.

It is, however, to be understood that any other combination of any reducing
agent known in the art with any detergent or other means known in the art to make the
cysteines more accessible is also within the scope of the present invention, insofar as said
combination reaches the same goal of disulphide bridge cleavage as the preferred
combination exemplified in the present invention.

Apart from reducing the disulphide bonds, a disulphide bond cleaving means
according to the present invention may also include any disulphide bridge exchanging
agent (the competitive agent being either organic or proteinaceous, see for instance Creighton,
1988) known in the art which allows the following type of reaction to occur:

R1-S-S-R2 + R3-SH -> R1-S-S-R3 + R2-SH

* R1, R2: compounds of protein aggregates

* R3-SH: competitive agent (organic, proteinaceous)

The term 'disulphide bridge exchanging agent' is to be interpreted as including
disulphide bond reforming as well as disulphide bond blocking agents.

The present invention also relates to methods for purifying or isolating HCV single 
or specific oligomeric envelope proteins as set out above further including the use of any
SH group blocking or binding reagent known in the art such as chosen from the following 
list: 

Glutathione

5,5'-dithiobis-(2-nitrobenzoic acid) or bis-(3-carboxy-4-nitrophenyl)-disulphide
(DTNB or Ellman's reagent) (Ellman, 1959)

N-ethylmaleimide (NEM; Benesch et al., 1956)

N-(4-dimethylamino-3,5-dinitrophenyl) maleimide or Tuppy's maleimide, which
provides a color to the protein

p-chloromercuribenzoate (Grassetti et al., 1969)

4-vinylpyridine (Friedman and Krull, 1969), which can be liberated after reaction by acid
hydrolysis

acrylonitrile, which can be liberated after reaction by acid hydrolysis (Weil and Seibles,
1961)

NEM-biotin (e.g. obtained from Sigma B 1267)

2,2'-dithiopyridine (Grassetti and Murray, 1967)

4,4'-dithiopyridine (Grassetti and Murray, 1967)

6,6'-dithiodinicotinic acid (DTDNA; Brown and Cunningham, 1970)

2,2'-dithiobis-(5'-nitropyridine) (DTNP; US patent 3597160) or other dithiobis
(heterocyclic derivative) compounds (Grassetti and Murray, 1969)

A survey of the publications cited shows that often different reagents for sulphydryl
groups will react with varying numbers of thiol groups of the same protein or enzyme 
molecule. One may conclude that this variation in reactivity of the thiol groups is due to 
the steric environment of these groups, such as the shape of the molecule and the 
surrounding groups of atoms and their charges, as well as to the size, shape and charge of 
the reagent molecule or ion. Frequently the presence of adequate concentrations of 
denaturants such as sodium dodecyl sulfate, urea or guanidine hydrochloride will cause 
sufficient unfolding of the protein molecule to permit equal access to all of the reagents for 
thiol groups. By varying the concentration of denaturant, the degree of unfolding can be
controlled and in this way thiol groups with different degrees of reactivity may be revealed. 
Although to date most of the work reported has been done with
p-chloromercuribenzoate, N-ethylmaleimide and DTNB, it is likely that the other more
recently developed reagents may prove equally useful. Because of their varying structures, 
it seems likely, in fact, that they may respond differently to changes in the steric 
environment of the thiol groups. 

Alternatively, conditions such as low pH (preferably lower than pH 6) for 
preventing free SH groups from oxidizing and thus preventing the formation of large 
intermolecular aggregates upon recombinant expression and purification of E1 and E2
(envelope) proteins are also within the scope of the present invention. 

A preferred SH group blocking reagent according to the present invention is
N-ethylmaleimide (NEM). Said SH group blocking reagent may be administered during lysis
of the recombinant host cells and after the above-mentioned partial reduction process or
after any other process for cleaving disulphide bridges. Said SH group blocking reagent
may also be modified with any group capable of providing a detectable label and/or any
group aiding in the immobilization of said recombinant protein to a solid substrate, e.g.
biotinylated NEM.

Methods for cleaving cysteine bridges and blocking free cysteines have also been 
described in Darbre (1987), Means and Feeney (1971), and by Wong (1993). 

A method to purify single or specific oligomeric recombinant E1 and/or E2 and/or
E1/E2 proteins according to the present invention as defined above is further characterized
as comprising the following steps:

lysing recombinant E1 and/or E2 and/or E1/E2 expressing host cells, preferably in
the presence of an SH group blocking agent, such as N-ethylmaleimide (NEM),
and possibly a suitable detergent, preferably Empigen-BB,

recovering said HCV envelope protein by affinity purification, for instance by
means of lectin chromatography, such as lentil-lectin chromatography, or
immunoaffinity chromatography using anti-E1 and/or anti-E2 specific monoclonal
antibodies, followed by,

reduction or cleavage of disulphide bonds with a disulphide bond cleaving agent,
such as DTT, preferably also in the presence of an SH group blocking agent, such
as NEM or Biotin-NEM, and,

recovering the reduced HCV E1 and/or E2 and/or E1/E2 envelope proteins, for
instance by gel filtration (size exclusion chromatography or molecular sieving) and
possibly also by an additional Ni²⁺-IMAC chromatography and desalting step.

It is to be understood that the above-mentioned recovery steps may also be carried 
out using any other suitable technique known by the person skilled in the art. 

Preferred lectin-chromatography systems include Galanthus nivalis agglutinin
(GNA) chromatography, or Lens culinaris agglutinin (LCA) (lentil) lectin
chromatography as illustrated in the Examples section. Other useful lectins include those
recognizing high-mannose type sugars, such as Narcissus pseudonarcissus agglutinin 
(NPA), Pisum sativum agglutinin (PSA), or Allium ursinum agglutinin (AUA). 

Preferably said method is usable to purify single or specific oligomeric HCV 
envelope protein produced intracellularly as detailed above. 

For secreted E1 or E2 or E1/E2 oligomers, lectins binding complex sugars, such as
Ricinus communis agglutinin I (RCA I), are preferred lectins.

The present invention more particularly contemplates essentially purified 
recombinant HCV single or specific oligomeric envelope proteins, selected from the group 
consisting of E1 and/or E2 and/or E1/E2, characterized as being isolated or purified by a
method as defined above. 

The present invention more particularly relates to the purification or isolation of
recombinant envelope proteins which are expressed from recombinant mammalian cells,
such as cells infected with recombinant vaccinia virus.

The present invention also relates to the purification or isolation of recombinant 
envelope proteins which are expressed from recombinant yeast cells. 

The present invention equally relates to the purification or isolation of recombinant 
envelope proteins which are expressed from recombinant bacterial (prokaryotic) cells. 

The present invention also contemplates a recombinant vector comprising a vector 
sequence, an appropriate prokaryotic, eukaryotic or viral or synthetic promoter sequence 
followed by a nucleotide sequence allowing the expression of the single or specific
oligomeric E1 and/or E2 and/or E1/E2 of the invention.

Particularly, the present invention contemplates a recombinant vector comprising a 
vector sequence, an appropriate prokaryotic, eukaryotic or viral or synthetic promoter 
sequence followed by a nucleotide sequence allowing the expression of the single E1 or E1
of the invention. 

Particularly, the present invention contemplates a recombinant vector comprising a 
vector sequence, an appropriate prokaryotic, eukaryotic or viral or synthetic promoter 
sequence followed by a nucleotide sequence allowing the expression of the single E1 or E2
of the invention. 

The segment of the HCV cDNA encoding the desired E1 and/or E2 sequence
inserted into the vector sequence may be attached to a signal sequence. Said signal
sequence may be that from a non-HCV source, e.g. the IgG or tissue plasminogen activator
(tpa) leader sequence for expression in mammalian cells, or the α-mating factor sequence
for expression in yeast cells, but particularly preferred constructs according to the present
invention contain signal sequences appearing in the HCV genome before the respective
start points of the E1 and E2 proteins. The segment of the HCV cDNA encoding the
desired E1 and/or E2 sequence inserted into the vector may also include deletions, e.g. of
the hydrophobic domain(s) as illustrated in the Examples section, or of the E2
hypervariable region I.

More particularly, the recombinant vectors according to the present invention
encompass a nucleic acid having an HCV cDNA segment encoding the polyprotein
starting in the region between amino acid positions 1 and 192 and ending in the region
between positions 250 and 400 of the HCV polyprotein, more preferably ending in the
region between positions 250 and 341, even more preferably ending in the region between
positions 290 and 341 for expression of the HCV single E1 protein. Most preferably, the
present recombinant vector encompasses a recombinant nucleic acid having an HCV cDNA
segment encoding part of the HCV polyprotein starting in the region between positions 117
and 192, and ending at any position in the region between positions 263 and 326, for
expression of HCV single E1 protein. Also within the scope of the present invention are
forms that have the first hydrophobic domain deleted (positions 264 to 293 plus or minus 8
amino acids), or forms to which a 5'-terminal ATG codon and a 3'-terminal stop codon have
been added, or forms to which a factor Xa cleavage site and/or 3 to 10, preferably 6,
histidine codons have been added.

More particularly, the recombinant vectors according to the present invention 
encompass a nucleic acid having an HCV cDNA segment encoding the polyprotein 
starting in the region between amino acid positions 290 and 406 and ending in the region 
between positions 600 and 820 of the HCV polyprotein, more preferably starting in the 
region between positions 322 and 406, even more preferably starting in the region between 
positions 347 and 406, even still more preferably starting in the region between positions 
364 and 406 for expression of the HCV single E2 protein. Most preferably, the present
recombinant vector encompasses a recombinant nucleic acid having an HCV cDNA
segment encoding the polyprotein starting in the region between positions 290 and 406,
and ending at any of positions 623, 650, 661, 673, 710, 715, 720, 746 or 809, for
expression of HCV single E2 protein. Also within the scope of the present invention are
forms to which a 5'-terminal ATG codon and a 3'-terminal stop codon have been added, or
forms to which a factor Xa cleavage site and/or 3 to 10, preferably 6, histidine codons
have been added.
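
The most preferred start and end positions given above for the E1 and E2 encoding segments lend themselves to a simple coordinate check. The sketch below is illustrative only: the function and constant names are hypothetical, and treating "between positions X and Y" as inclusive of both endpoints is an assumption.

```python
# Hypothetical helper; names are illustrative and not taken from the specification.
# Ranges are read as inclusive of both endpoints (an assumption).
E1_START = (117, 192)   # most preferred E1 start region
E1_END = (263, 326)     # most preferred E1 end region
E2_START = (290, 406)   # most preferred E2 start region
E2_ENDS = {623, 650, 661, 673, 710, 715, 720, 746, 809}  # listed E2 end positions


def in_range(pos, bounds):
    """True if pos lies within the inclusive (low, high) bounds."""
    low, high = bounds
    return low <= pos <= high


def is_most_preferred_e1(start, end):
    """Check a construct against the most preferred E1 coordinates."""
    return in_range(start, E1_START) and in_range(end, E1_END)


def is_most_preferred_e2(start, end):
    """Check a construct against the most preferred E2 coordinates."""
    return in_range(start, E2_START) and end in E2_ENDS
```

A construct described only by its polyprotein start and end positions can then be screened with, for example, `is_most_preferred_e2(384, 673)`.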

A variety of vectors may be used to obtain recombinant expression of HCV single 
or specific oligomeric envelope proteins of the present invention. Lower eukaryotes such 
as yeasts and glycosylation mutant strains are typically transformed with plasmids, or are 
transformed with a recombinant virus. The vectors may replicate within the host 
independently, or may integrate into the host cell genome. 

Higher eukaryotes may be transformed with vectors, or may be infected with a 
recombinant virus, for example a recombinant vaccinia virus. Techniques and vectors for 
the insertion of foreign DNA into vaccinia virus are well known in the art, and utilize, for
example, homologous recombination. A wide variety of viral promoter sequences, possibly
terminator sequences and poly(A)-addition sequences, possibly enhancer sequences and
possibly amplification sequences, all required for the mammalian expression, are available
in the art. Vaccinia is particularly preferred since vaccinia halts the expression of host cell
proteins. Vaccinia is also very much preferred since it allows the expression of E1 and E2
proteins of HCV in cells or individuals which are immunized with the live recombinant
vaccinia virus. For vaccination of humans the avipox and Ankara Modified Virus (AMV)
are particularly useful vectors.

Also known are insect expression transfer vectors derived from baculovirus 
Autographa californica nuclear polyhedrosis virus (AcNPV), which is a helper- 
independent viral expression vector. Expression vectors derived from this system usually 
use the strong viral polyhedrin gene promoter to drive the expression of heterologous 
genes. Different vectors as well as methods for the introduction of heterologous DNA into 
the desired site of baculovirus are available to the man skilled in the art for baculovirus 
expression. Also different signals for posttranslational modification recognized by insect 
cells are known in the art. 

Also included within the scope of the present invention is a method for producing
purified recombinant single or specific oligomeric HCV E1 or E2 or E1/E2 proteins,
wherein the cysteine residues involved in aggregate formation are replaced at the level of
the nucleic acid sequence by other residues such that aggregate formation is prevented.
The recombinant proteins expressed by recombinant vectors carrying such a mutated E1
and/or E2 protein encoding nucleic acid are also within the scope of the present invention.

The present invention also relates to recombinant E1 and/or E2 and/or E1/E2
proteins characterized in that at least one of their glycosylation sites has been removed and
are consequently termed glycosylation mutants. As explained in the Examples section,
different glycosylation mutants may be desired to diagnose (screening, confirmation,
prognosis, etc.) and prevent HCV disease according to the patient in question. An E2
protein glycosylation mutant lacking the GLY4 has for instance been found to improve the
reactivity of certain sera in diagnosis. These glycosylation mutants are preferably purified
according to the method disclosed in the present invention. Also contemplated within the
present invention are recombinant vectors carrying the nucleic acid insert encoding such an
E1 and/or E2 and/or E1/E2 glycosylation mutant as well as host cells transformed with such
a recombinant vector.

The present invention also relates to recombinant vectors including a 
polynucleotide which also forms part of the present invention. The present invention 
relates more particularly to the recombinant nucleic acids as represented in SEQ ID NO 3, 
5, 7, 9, 11, 13, 21, 23, 25, 27, 29, 31, 35, 37, 39, 41, 43, 45, 47 and 49, or parts thereof.

The present invention also contemplates host cells transformed with a recombinant
vector as defined above, wherein said vector comprises a nucleotide sequence encoding
HCV E1 and/or E2 and/or E1/E2 protein as defined above in addition to a regulatory
sequence operably linked to said HCV E1 and/or E2 and/or E1/E2 sequence and capable of
regulating the expression of said HCV E1 and/or E2 and/or E1/E2 protein.

Eukaryotic hosts include lower and higher eukaryotic hosts as described in the 
definitions section. Lower eukaryotic hosts include yeast cells well known in the art. 
Higher eukaryotic hosts mainly include mammalian cell lines known in the art and include 
many immortalized cell lines available from the ATCC, including HeLa cells, Chinese
hamster ovary (CHO) cells, Baby hamster kidney (BHK) cells, PK15, RK13 and a number
of other cell lines. 

The present invention relates particularly to a recombinant E1 and/or E2 and/or
E1/E2 protein expressed by a host cell as defined above containing a recombinant vector
as defined above. These recombinant proteins are particularly purified according to the 
method of the present invention. 

A preferred method for isolating or purifying HCV envelope proteins as defined 
above is further characterized as comprising at least the following steps: 

growing a host cell as defined above transformed with a recombinant vector
according to the present invention or with a known recombinant vector expressing
E1 and/or E2 and/or E1/E2 HCV envelope proteins in a suitable culture medium,

causing expression of said vector sequence as defined above under suitable
conditions, and,

lysing said transformed host cells, preferably in the presence of an SH group
blocking agent, such as N-ethylmaleimide (NEM), and possibly a suitable
detergent, preferably Empigen-BB,

recovering said HCV envelope protein by affinity purification such as by means of
lectin chromatography or immunoaffinity chromatography using anti-E1 and/or
anti-E2 specific monoclonal antibodies, with said lectin being preferably
lentil-lectin or GNA, followed by,

incubation of the eluate of the previous step with a disulphide bond cleavage
means, such as DTT, preferably followed by incubation with an SH group blocking
agent, such as NEM or Biotin-NEM, and,

isolating the HCV single or specific oligomeric E1 and/or E2 and/or E1/E2 proteins
such as by means of gel filtration and possibly also by a subsequent Ni²⁺-IMAC
chromatography followed by a desalting step.

As a result of the above-mentioned process, E1 and/or E2 and/or E1/E2 proteins
may be produced in a form which elutes differently from the large aggregates containing
vector-derived components and/or cell components in the void volume of the gel filtration
column or the IMAC column as illustrated in the Examples section. The disulphide bridge
cleavage step advantageously also eliminates the false reactivity due to the presence of host
and/or expression-system-derived proteins. The presence of NEM and a suitable detergent
during lysis of the cells may already partly or even completely prevent the aggregation
between the HCV envelope proteins and contaminants.

Ni²⁺-IMAC chromatography followed by a desalting step is preferably used for
constructs bearing a (His)6 as described by Janknecht et al., 1991, and Hochuli et al., 1988.

The present invention also relates to a method for producing monoclonal antibodies 
in small animals such as mice or rats, as well as a method for screening and isolating 
human B-cells that recognize anti-HCV antibodies, using the HCV single or specific
oligomeric envelope proteins of the present invention. 

The present invention further relates to a composition comprising at least one of the
following E1 peptides as listed in Table 3:

E1-31 (SEQ ID NO 56) spanning amino acids 181 to 200 of the Core/E1 V1
region,

E1-33 (SEQ ID NO 57) spanning amino acids 193 to 212 of the E1 region,

E1-35 (SEQ ID NO 58) spanning amino acids 205 to 224 of the E1 V2 region
(epitope B),

E1-35A (SEQ ID NO 59) spanning amino acids 208 to 227 of the E1 V2 region
(epitope B),

1bE1 (SEQ ID NO 53) spanning amino acids 192 to 228 of E1 regions (V1, C1,
and V2 regions (containing epitope B)),

E1-51 (SEQ ID NO 66) spanning amino acids 301 to 320 of the E1 region,

E1-53 (SEQ ID NO 67) spanning amino acids 313 to 332 of the E1 C4 region
(epitope A),

E1-55 (SEQ ID NO 68) spanning amino acids 325 to 344 of the E1 region.
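
The spans in the E1 peptide list above can be tabulated to check peptide lengths and the overlap between neighbouring peptides. This is a minimal sketch: the dictionary and function names are hypothetical, the positions are copied from the list, and reading each span as inclusive of both end positions is an assumption.

```python
# Spans copied from the E1 peptide list; (start, end) polyprotein positions,
# read as inclusive at both ends (an assumption). Names are illustrative.
E1_PEPTIDES = {
    "E1-31": (181, 200),
    "E1-33": (193, 212),
    "E1-35": (205, 224),
    "E1-35A": (208, 227),
    "1bE1": (192, 228),
    "E1-51": (301, 320),
    "E1-53": (313, 332),
    "E1-55": (325, 344),
}


def length(span):
    """Number of residues in an inclusive (start, end) span."""
    start, end = span
    return end - start + 1


def overlap(a, b):
    """Number of positions shared by two inclusive spans (0 if disjoint)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)
```

Under that reading, the consecutive peptides E1-31, E1-33 and E1-35 are each 20 residues long and share 8 residues with their neighbour, while 1bE1 covers 37 residues.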

The present invention also relates to a composition comprising at least one of the
following E2 peptides as listed in Table 3: 

Env 67 or E2-67 (SEQ ID NO 72) spanning amino acid positions 397 to 416 of the 
E2 region (epitope A, recognized by monoclonal antibody 2F10H10, see Figure 
19), 

Env 69 or E2-69 (SEQ ID NO 73) spanning amino acid positions 409 to 428 of the 
E2 region (epitope A), 

Env 23 or E2-23 (SEQ ID NO 86) spanning positions 583 to 602 of the E2 region 
(epitope E), 

Env 25 or E2-25 (SEQ ID NO 87) spanning positions 595 to 614 of the E2 region 
(epitope E), 

Env 27 or E2-27 (SEQ ID NO 88) spanning positions 607 to 626 of the E2 region 
(epitope E), 

Env 17B or E2-17B (SEQ ID NO 83) spanning positions 547 to 566 of the E2
region (epitope D),

Env 13B or E2-13B (SEQ ID NO 82) spanning positions 523 to 542 of the E2
region (epitope C; recognized by monoclonal antibody 16A6E7, see Figure 19).

The present invention also relates to a composition comprising at least one of the
following E2 conformational epitopes: 

epitope F recognized by monoclonal antibodies 15C8C1, 12D11F1 and
8G10D1H9,

epitope G recognized by monoclonal antibody 9G3E6, 

epitope H (or C) recognized by monoclonal antibodies 10D3C4 and 4H6B2, or,
epitope I recognized by monoclonal antibody 17F2C2. 

The present invention also relates to an E1 or E2 specific antibody raised upon
immunization with a peptide or protein composition, with said antibody being specifically
reactive with any of the polypeptides or peptides as defined above, and with said antibody
being preferably a monoclonal antibody.

The present invention also relates to an E1 or E2 specific antibody screened from a
variable chain library in plasmids or phages or from a population of human B-cells by
means of a process known in the art, with said antibody being reactive with any of the
polypeptides or peptides as defined above, and with said antibody being preferably a
monoclonal antibody.

The E1 or E2 specific monoclonal antibodies of the invention can be produced by
any hybridoma liable to be formed according to classical methods from splenic cells of an
animal, particularly from a mouse or rat, immunized against the HCV polypeptides or
peptides according to the invention, as defined above on the one hand, and of cells of a
myeloma cell line on the other hand, and to be selected by the ability of the hybridoma to
produce the monoclonal antibodies recognizing the polypeptides which have been initially
used for the immunization of the animals.

The antibodies involved in the invention can be labelled by an appropriate label of
the enzymatic, fluorescent, or radioactive type.

The monoclonal antibodies according to this preferred embodiment of the
invention may be humanized versions of mouse monoclonal antibodies made by means of
recombinant DNA technology, departing from parts of mouse and/or human genomic
DNA sequences coding for H and L chains from cDNA or genomic clones coding for H
and L chains.

Alternatively, the monoclonal antibodies according to this preferred embodiment of
the invention may be human monoclonal antibodies. These antibodies according to the
present embodiment of the invention can also be derived from human peripheral blood
lymphocytes of patients infected with HCV, or vaccinated against HCV. Such human
monoclonal antibodies are prepared, for instance, by means of human peripheral blood
lymphocytes (PBL) repopulation of severe combined immune deficiency (SCID) mice (for
recent review, see Duchosal et al., 1992).

The invention also relates to the use of the proteins or peptides of the invention for
the selection of recombinant antibodies by the process of repertoire cloning (Persson et al.,
1991).

Antibodies directed to peptides or single or specific oligomeric envelope proteins
derived from a certain genotype may be used as a medicament, more particularly for
incorporation into an immunoassay for the detection of HCV genotypes (for detecting the
presence of HCV E1 or E2 antigen), for prognosing/monitoring of HCV disease, or as
therapeutic agents.

Alternatively, the present invention also relates to the use of any of the
above-specified E1 or E2 specific monoclonal antibodies for the preparation of an immunoassay
kit for detecting the presence of E1 or E2 antigen in a biological sample, for the preparation
of a kit for prognosing/monitoring of HCV disease or for the preparation of an HCV
medicament.

The present invention also relates to a method for in vitro diagnosis or detection
of HCV antigen present in a biological sample, comprising at least the following steps:

(i) contacting said biological sample with any of the E1 and/or E2 specific
monoclonal antibodies as defined above, preferably in an immobilized form 
under appropriate conditions which allow the formation of an immune 
complex, 

(ii) removing unbound components, 

(iii) incubating the immune complexes formed with heterologous antibodies,
which specifically bind to the antibodies present in the sample to be
analyzed, with said heterologous antibodies being conjugated to a
detectable label under appropriate conditions,

(iv) detecting the presence of said immune complexes visually or mechanically 
(e.g. by means of densitometry, fluorimetry, colorimetry). 

The present invention also relates to a kit for in vitro diagnosis of HCV antigen 
present in a biological sample, comprising: 

at least one monoclonal antibody as defined above, with said antibody 
being preferentially immobilized on a solid substrate, 

a buffer or components necessary for producing the buffer enabling binding 
reaction between these antibodies and the HCV antigens present in the 
biological sample, 

a means for detecting the immune complexes formed in the preceding
binding reaction, 

possibly also including an automated scanning and interpretation device for 
inferring the HCV antigens present in the sample from the observed 
binding pattern. 

The present invention also relates to a composition comprising E1 and/or E2 and/or
E1/E2 recombinant HCV proteins purified according to the method of the present
invention or a composition comprising at least one peptide as specified above for use as a
medicament.

The present invention more particularly relates to a composition comprising at least 
one of the above-specified envelope peptides or a recombinant envelope protein 
composition as defined above, for use as a vaccine for immunizing a mammal, preferably 
humans, against HCV, comprising administering a sufficient amount of the composition 
possibly accompanied by pharmaceutically acceptable adjuvant(s), to produce an immune
response. 

More particularly, the present invention relates to the use of any of the 
compositions as described here above for the preparation of a vaccine as described above. 

Also, the present invention relates to a vaccine composition for immunizing a 
mammal, preferably humans, against HCV, comprising HCV single or specific oligomeric 
proteins or peptides derived from the E1 and/or the E2 region as described above.

Immunogenic compositions can be prepared according to methods known in the 
art. The present compositions comprise an immunogenic amount of recombinant E1
and/or E2 and/or E1/E2 single or specific oligomeric proteins as defined above or E1 or E2
peptides as defined above, usually combined with a pharmaceutically acceptable carrier,
preferably further comprising an adjuvant.

The single or specific oligomeric envelope proteins of the present invention, either 
E1 and/or E2 and/or E1/E2, are expected to provide a particularly useful vaccine antigen,
since the formation of antibodies to either E1 or E2 may be more desirable than to the other
envelope protein, and since the E2 protein is cross-reactive between HCV types and the E1
protein is type-specific. Cocktails including type 1 E2 protein and E1 proteins derived from
several genotypes may be particularly advantageous. Cocktails containing a molar excess
of E1 versus E2 or E2 versus E1 may also be particularly useful. Immunogenic
compositions may be administered to animals to induce production of antibodies, either to 
provide a source of antibodies or to induce protective immunity in the animal. 

Pharmaceutically acceptable carriers include any carrier that does not itself induce 
the production of antibodies harmful to the individual receiving the composition. Suitable 
carriers are typically large, slowly metabolized macromolecules such as proteins, 
polysaccharides, polylactic acids, polyglycolic acids, polymeric amino acids, amino acid 
copolymers, and inactive virus particles. Such carriers are well known to those of ordinary
skill in the art. 

Preferred adjuvants to enhance effectiveness of the composition include, but are not 
limited to: aluminium hydroxide (alum), N-acetyl-muramyl-L-threonyl-D-isoglutamine
(thr-MDP) as found in U.S. Patent No. 4,606,918, N-acetyl-normuramyl-L-alanyl-D-
isoglutamine (nor-MDP), N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1'-2'-
dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-ethylamine (MTP-PE) and RIBI, which
contains three components extracted from bacteria, monophosphoryl lipid A, trehalose 
dimycolate, and cell wall skeleton (MPL+TDM+CWS) in a 2% squalene/Tween 80 
emulsion. Any of the 3 components MPL, TDM or CWS may also be used alone or 
combined 2 by 2. Additionally, adjuvants such as Stimulon (Cambridge Bioscience, 
Worcester, MA) or SAF-1 (Syntex) may be used. Further, Complete Freund's Adjuvant 
(CFA) and Incomplete Freund's Adjuvant (IFA) may be used for non-human applications
and research purposes. 

The immunogenic compositions typically will contain pharmaceutically acceptable 
vehicles, such as water, saline, glycerol, ethanol, etc. Additionally, auxiliary substances, 
such as wetting or emulsifying agents, pH buffering substances, preservatives, and the like, 
may be included in such vehicles. 

Typically, the immunogenic compositions are prepared as injectables, either as 
liquid solutions or suspensions; solid forms suitable for solution in, or suspension in, liquid 
vehicles prior to injection may also be prepared. The preparation also may be emulsified or 
encapsulated in liposomes for enhanced adjuvant effect. The E1 and E2 proteins may also
be incorporated into Immune Stimulating Complexes together with saponins, for example 
Quil A (ISCOMS).

Immunogenic compositions used as vaccines comprise a 'sufficient amount' or 'an 
immunologically effective amount' of the envelope proteins of the present invention, as 
well as any other of the above mentioned components, as needed. 'Immunologically 
effective amount' means that the administration of that amount to an individual, either in a
single dose or as part of a series, is effective for treatment, as defined above. This amount 
varies depending upon the health and physical condition of the individual to be treated, the 
taxonomic group of individual to be treated (e.g. nonhuman primate, primate, etc.), the 
capacity of the individual's immune system to synthesize antibodies, the degree of 
protection desired, the formulation of the vaccine, the treating doctor's assessment of the 
medical situation, the strain of infecting HCV, and other relevant factors. It is expected that 
the amount will fall in a relatively broad range that can be determined through routine 
trials. Usually, the amount will vary from 0.01 to 1000 µg/dose, more particularly from 0.1
to 100 µg/dose.

The single or specific oligomeric envelope proteins may also serve as vaccine 
carriers to present homologous (e.g. T cell epitopes or B cell epitopes from the core, NS2, 
NS3, NS4 or NS5 regions) or heterologous (non-HCV) haptens, in the same manner as 
Hepatitis B surface antigen (see European Patent Application 174,444). In this use,
envelope proteins provide an immunogenic carrier capable of stimulating an immune 
response to haptens or antigens conjugated to the aggregate. The antigen may be 
conjugated either by conventional chemical methods, or may be cloned into the gene 
encoding E1 and/or E2 at a location corresponding to a hydrophilic region of the protein.
Such hydrophilic regions include the V1 region (encompassing amino acid positions 191
to 202), the V2 region (encompassing amino acid positions 213 to 223), the V3 region 
(encompassing amino acid positions 230 to 242), the V4 region (encompassing amino acid 
positions 230 to 242), the V5 region (encompassing amino acid positions 294 to 303) and 
the V6 region (encompassing amino acid positions 329 to 336). Another useful location for 
insertion of haptens is the hydrophobic region (encompassing approximately amino acid 
positions 264 to 293). It is shown in the present invention that this region can be deleted 
without affecting the reactivity of the deleted E1 protein with antisera. Therefore, haptens
may be inserted at the site of the deletion.

The immunogenic compositions are conventionally administered parenterally,
typically by injection, for example, subcutaneously or intramuscularly. Additional 
formulations suitable for other methods of administration include oral formulations and 
suppositories. Dosage treatment may be a single dose schedule or a multiple dose schedule. 
The vaccine may be administered in conjunction with other immunoregulatory agents.

The present invention also relates to a composition comprising peptides or 
polypeptides as described above, for in vitro detection of HCV antibodies present in a 
biological sample. 

The present invention also relates to the use of a composition as described above 
for the preparation of an immunoassay kit for detecting HCV antibodies present in a 
biological sample. 

The present invention also relates to a method for in vitro diagnosis of HCV 
antibodies present in a biological sample, comprising at least the following steps : 

(i) contacting said biological sample with a composition comprising any of the
envelope peptides or proteins as defined above, preferably in an immobilized
form under appropriate conditions which allow the formation of an immune
complex, wherein said peptide or protein can be a biotinylated peptide or 
protein which is covalently bound to a solid substrate by means of 
streptavidin or avidin complexes, 

(ii) removing unbound components, 

(iii) incubating the immune complexes formed with heterologous antibodies, 
with said heterologous antibodies being conjugated to a detectable label
under appropriate conditions, 

(iv) detecting the presence of said immune complexes visually or mechanically 
(e.g. by means of densitometry, fluorimetry, colorimetry). 

Alternatively, the present invention also relates to competition immunoassay 
formats in which recombinantly produced purified single or specific oligomeric E1
and/or E2 and/or E1/E2 proteins as disclosed above are used in combination with E1
and/or E2 peptides in order to compete for HCV antibodies present in a biological sample. 

The present invention also relates to a kit for determining the presence of HCV 
antibodies in a biological sample, comprising:

at least one peptide or protein composition as defined above, possibly in
combination with other polypeptides or peptides from HCV or other types 
of HCV, with said peptides or proteins being preferentially immobilized on 
a solid substrate, more preferably on different microwells of the same 
ELISA plate, and even more preferentially on one and the same membrane 
strip, 

a buffer or components necessary for producing the buffer enabling binding 
reaction between these polypeptides or peptides and the antibodies against 
HCV present in the biological sample, 

means for detecting the immune complexes formed in the preceding 
binding reaction, 

possibly also including an automated scanning and interpretation device for
inferring the HCV genotypes present in the sample from the observed 
binding pattern. 

The immunoassay methods according to the present invention utilize single or 
specific oligomeric antigens from the E1 and/or E2 domains that maintain linear (in case of
peptides) and conformational epitopes (single or specific oligomeric proteins) recognized 
by antibodies in the sera from individuals infected with HCV. It is within the scope of the 
invention to use for instance single or specific oligomeric antigens, dimeric antigens, as 
well as combinations of single or specific oligomeric antigens. The HCV E1 and E2
antigens of the present invention may be employed in virtually any assay format that 
employs a known antigen to detect antibodies. Of course, a format that denatures the HCV 
conformational epitope should be avoided or adapted. A common feature of all of these 
assays is that the antigen is contacted with the body component suspected of containing 
HCV antibodies under conditions that permit the antigen to bind to any such antibody 
present in the component. Such conditions will typically be physiologic temperature, pH 
and ionic strength using an excess of antigen. The incubation of the antigen with the
specimen is followed by detection of immune complexes comprised of the antigen. 

Design of the immunoassays is subject to a great deal of variation, and many 
formats are known in the art. Protocols may, for example, use solid supports, or 
immunoprecipitation. Most assays involve the use of labeled antibody or polypeptide; the
labels may be, for example, enzymatic, fluorescent, chemiluminescent, radioactive, or dye
molecules. Assays which amplify the signals from the immune complex are also known; 
examples of which are assays which utilize biotin and avidin or streptavidin, and enzyme- 
labeled and mediated immunoassays, such as ELISA assays. 

The immunoassay may be, without limitation, in a heterogeneous or in a 
homogeneous format, and of a standard or competitive type. In a heterogeneous format, the 
polypeptide is typically bound to a solid matrix or support to facilitate separation of the 
sample from the polypeptide after incubation. Examples of solid supports that can be used 
are nitrocellulose (e.g., in membrane or microtiter well form), polyvinyl chloride (e.g., in 
sheets or microtiter wells), polystyrene latex (e.g., in beads or microtiter plates),
polyvinylidine fluoride (known as Immunolon™), diazotized paper, nylon membranes,
activated beads, and Protein A beads. For example, Dynatech Immunolon™ 1 or
Immunolon™ 2 microtiter plates or 0.25 inch polystyrene beads (Precision Plastic Ball) can
be used in the heterogeneous format. The solid support containing the antigenic 
polypeptides is typically washed after separating it from the test sample, and prior to 
detection of bound antibodies. Both standard and competitive formats are known in the art.

In a homogeneous format, the test sample is incubated with the combination of 
antigens in solution. For example, it may be under conditions that will precipitate any 
antigen-antibody complexes which are formed. Both standard and competitive formats for 
these assays are known in the art. 

In a standard format, the amount of HCV antibodies in the antibody-antigen 
complexes is directly monitored. This may be accomplished by determining whether 
labeled anti-xenogeneic (e.g. anti-human) antibodies which recognize an epitope on anti-
HCV antibodies will bind due to complex formation. In a competitive format, the amount
of HCV antibodies in the sample is deduced by monitoring the competitive effect on the
binding of a known amount of labeled antibody (or other competing ligand) in the 
complex. 
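
The competitive read-out described above is often summarised as a percent inhibition of the labeled antibody's signal. The following is a minimal sketch only: the formula is a common convention rather than one given in this description, and the function name and the signal values are hypothetical.

```python
def percent_inhibition(signal_with_sample: float, signal_without_sample: float) -> float:
    """Percent reduction of the labeled-antibody signal caused by the test sample.

    signal_without_sample is the signal of the labeled antibody alone
    (no competing HCV antibodies); both arguments are assumed to be
    background-corrected readings in the same units.
    """
    if signal_without_sample <= 0:
        raise ValueError("reference signal must be positive")
    return 100.0 * (1.0 - signal_with_sample / signal_without_sample)


# Hypothetical readings: the sample reduces the signal from 1.0 to 0.25.
inhibition = percent_inhibition(0.25, 1.0)
```

A higher percent inhibition indicates more competing HCV antibody in the sample; the level at which a sample is called positive would have to be set for each assay.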

Complexes formed comprising anti-HCV antibody (or in the case of competitive 
assays, the amount of competing antibody) are detected by any of a number of known 
techniques, depending on the format. For example, unlabeled HCV antibodies in the
complex may be detected using a conjugate of anti-xenogeneic Ig complexed with a label
(e.g. an enzyme label). 

In an immunoprecipitation or agglutination assay format the reaction between the 
HCV antigens and the antibody forms a network that precipitates from the solution or 
suspension and forms a visible layer or film of precipitate. If no anti-HCV antibody is 
present in the test specimen, no visible precipitate is formed. 

There currently exist three specific types of particle agglutination (PA) assays. 
These assays are used for the detection of antibodies to various antigens when coated to a 
support. One type of this assay is the hemagglutination assay using red blood cells (RBCs) 
that are sensitized by passively adsorbing antigen (or antibody) to the RBC. The addition 
of specific antigen antibodies present in the body component, if any, causes the RBCs 
coated with the purified antigen to agglutinate. 

To eliminate potential non-specific reactions in the hemagglutination assay, two 
artificial carriers may be used instead of RBC in the PA. The most common of these are 
latex particles. However, gelatin particles may also be used. The assays utilizing either of 
these carriers are based on passive agglutination of the particles coated with purified
antigens. 

The HCV single or specific oligomeric E1 and/or E2 and/or E1/E2 antigens of the
present invention comprised of conformational epitopes will typically be packaged in the 
form of a kit for use in these immunoassays. The kit will normally contain in separate 
containers the native HCV antigen, control antibody formulations (positive and/or 
negative), labeled antibody when the assay format requires the same and signal generating 
reagents (e.g. enzyme substrate) if the label does not generate a signal directly. The native 
HCV antigen may be already bound to a solid matrix or separate with reagents for binding 
it to the matrix. Instructions (e.g. written, tape, CD-ROM, etc.) for carrying out the assay 
usually will be included in the kit. 

Immunoassays that utilize the native HCV antigen are useful in screening blood for 
the preparation of a supply from which potentially infective HCV is lacking. The method 
for the preparation of the blood supply comprises the following steps. Reacting a body 
component, preferably blood or a blood component, from the individual donating blood 
with HCV E1 and/or E2 proteins of the present invention to allow an immunological
reaction between HCV antibodies, if any, and the HCV antigen. Detecting whether anti-
HCV antibody - HCV antigen complexes are formed as a result of the reacting. Blood
contributed to the blood supply is from donors that do not exhibit antibodies to the native
HCV antigens, E1 or E2.

In cases of a positive reactivity to the HCV antigen, it is preferable to repeat the
immunoassay to lessen the possibility of false positives. For example, in the large scale
screening of blood for the production of blood products (e.g. blood transfusion, plasma,
Factor VIII, immunoglobulin, etc.) 'screening' tests are typically formatted to increase
sensitivity (to insure no contaminated blood passes) at the expense of specificity; i.e. the
false-positive rate is increased. Thus, it is typical to only defer for further testing those
donors who are 'repeatedly reactive'; i.e. positive in two or more runs of the immunoassay
on the donated sample. However, for confirmation of HCV-positivity, the 'confirmation'
tests are typically formatted to increase specificity (to insure that no false-positive samples
are confirmed) at the expense of sensitivity. Therefore the purification method described in
the present invention for E1 and E2 will be very advantageous for including single or
specific oligomeric envelope proteins into HCV diagnostic assays.
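
The 'repeatedly reactive' rule described above can be sketched in a few lines of Python. This is an illustration only: the cutoff value, the function names and the outcome labels are hypothetical and are not taken from the present description.

```python
from typing import Sequence

# Hypothetical signal cutoff; a real assay derives its cutoff from its own controls.
CUTOFF_OD = 0.5


def is_reactive(od: float, cutoff: float = CUTOFF_OD) -> bool:
    """A single run is called reactive when its signal exceeds the cutoff."""
    return od > cutoff


def screening_decision(runs: Sequence[float], cutoff: float = CUTOFF_OD) -> str:
    """Map the immunoassay runs of one donated sample to a screening outcome."""
    if not runs:
        raise ValueError("at least one run is required")
    reactive_runs = sum(1 for od in runs if is_reactive(od, cutoff))
    if reactive_runs >= 2:
        # 'Repeatedly reactive': positive in two or more runs.
        return "defer for further testing"
    if reactive_runs == 1:
        # A single positive run is repeated to lessen false positives.
        return "repeat the immunoassay"
    return "non-reactive"
```

For example, `screening_decision([0.9])` asks for a repeat, while `screening_decision([0.9, 0.8])` defers the donor for confirmatory testing.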

The solid phase selected can include polymeric or glass beads, nitrocellulose, 
microparticles, microwells of a reaction tray, test tubes and magnetic beads. The signal
generating compound can include an enzyme, a luminescent compound, a chromogen, a
radioactive element and a chemiluminescent compound. Examples of enzymes include
alkaline phosphatase, horseradish peroxidase and beta-galactosidase. Examples of enhancer
compounds include biotin, anti-biotin and avidin. Examples of enhancer compounds
binding members include biotin, anti-biotin and avidin. In order to block the effects of
rheumatoid factor-like substances, the test sample is subjected to conditions sufficient to
block the effect of rheumatoid factor-like substances. These conditions comprise
contacting the test sample with a quantity of anti-human IgG to form a mixture, and
incubating the mixture for a time and under conditions sufficient to form a reaction mixture
product substantially free of rheumatoid factor-like substance.

The present invention further contemplates the use of E1 proteins, or parts thereof,
more particularly HCV single or specific oligomeric E1 proteins as defined above, for in
vitro monitoring HCV disease or prognosing the response to treatment (for instance with
Interferon) of patients suffering from HCV infection comprising:

incubating a biological sample from a patient with hepatitis C infection
with an E1 protein or a suitable part thereof under conditions allowing the
formation of an immunological complex,

removing unbound components,

calculating the anti-E1 titers present in said sample (for example at the start
of and/or during the course of (interferon) therapy),

monitoring the natural course of HCV disease, or prognosing the response
to treatment of said patient on the basis of the amount of anti-E1 titers found
in said sample at the start of treatment and/or during the course of
treatment.

Patients who show a decrease of 2, 3, 4, 5, 7, 10, 15, or preferably more than 20 
times of the initial anti-E1 titers could be concluded to be long-term, sustained responders
to HCV therapy, more particularly to interferon therapy. It is illustrated in the Examples
section, that an anti-E1 assay may be very useful for prognosing long-term response to IFN
treatment, or to treatment of Hepatitis C virus disease in general. 
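
As an illustration of the fold-decrease criterion above, the following minimal sketch computes how many times the anti-E1 titer has dropped relative to its initial value and reports the largest of the listed thresholds that is reached. It assumes that titers are available as positive numbers (for instance end-point dilutions); the function names are hypothetical, and treating a decrease of exactly 20 times as meeting the last threshold is a simplification of 'more than 20 times'.

```python
from typing import Optional

# Fold-decrease values named in the text; the largest is the preferred one.
THRESHOLDS = (2, 3, 4, 5, 7, 10, 15, 20)


def fold_decrease(initial_titer: float, current_titer: float) -> float:
    """Return how many times lower the current anti-E1 titer is than the initial one."""
    if initial_titer <= 0 or current_titer <= 0:
        raise ValueError("titers must be positive")
    return initial_titer / current_titer


def highest_threshold_met(initial_titer: float, current_titer: float) -> Optional[int]:
    """Return the largest listed threshold reached by the decrease, or None."""
    fold = fold_decrease(initial_titer, current_titer)
    met = [t for t in THRESHOLDS if fold >= t]
    return max(met) if met else None
```

With hypothetical titers of 1600 at the start of therapy and 100 during follow-up, the decrease is 16-fold and the highest threshold reached is 15.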

More particularly the following E1 peptides as listed in Table 3 were found to be
useful for in vitro monitoring HCV disease or prognosing the response to interferon
treatment of patients suffering from HCV infection:

E1-31 (SEQ ID NO 56) spanning amino acids 181 to 200 of the Core/E1 V1
region,

E1-33 (SEQ ID NO 57) spanning amino acids 193 to 212 of the E1 region,

E1-35 (SEQ ID NO 58) spanning amino acids 205 to 224 of the E1 V2 region
(epitope B),

E1-35A (SEQ ID NO 59) spanning amino acids 208 to 227 of the E1 V2 region
(epitope B),

1bE1 (SEQ ID NO 53) spanning amino acids 192 to 228 of E1 regions (V1, C1,
and V2 regions (containing epitope B)),

E1-51 (SEQ ID NO 66) spanning amino acids 301 to 320 of the E1 region,

E1-53 (SEQ ID NO 67) spanning amino acids 313 to 332 of the E1 C4 region
(epitope A),

E1-55 (SEQ ID NO 68) spanning amino acids 325 to 344 of the E1 region.

It is to be understood that smaller fragments of the above-mentioned peptides also 
fall within the scope of the present invention. Said smaller fragments can be easily 
prepared by chemical synthesis and can be tested for their ability to be used in an assay as 
detailed above and in the Examples section. 
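
The amino acid spans of the peptides listed above can be captured in a small lookup structure, which makes it easy to see which peptides cover a given position and how far neighbouring peptides overlap. This is a sketch for illustration only; the positions and SEQ ID numbers are those given in the list, while the class and function names are hypothetical.

```python
from typing import List, NamedTuple


class Peptide(NamedTuple):
    name: str
    seq_id_no: int
    start: int  # first amino acid position (inclusive)
    end: int    # last amino acid position (inclusive)


E1_PEPTIDES = [
    Peptide("E1-31", 56, 181, 200),
    Peptide("E1-33", 57, 193, 212),
    Peptide("E1-35", 58, 205, 224),
    Peptide("E1-35A", 59, 208, 227),
    Peptide("1bE1", 53, 192, 228),
    Peptide("E1-51", 66, 301, 320),
    Peptide("E1-53", 67, 313, 332),
    Peptide("E1-55", 68, 325, 344),
]


def peptides_covering(position: int) -> List[str]:
    """Names of the listed peptides whose span includes the given position."""
    return [p.name for p in E1_PEPTIDES if p.start <= position <= p.end]


def overlap(a: Peptide, b: Peptide) -> int:
    """Number of amino acid positions shared by two peptide spans."""
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)
```

For instance, position 210 is covered by E1-33, E1-35, E1-35A and 1bE1, and consecutive 20-mers such as E1-31 and E1-33 share 8 positions.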

The present invention also relates to a kit for monitoring HCV disease or 
prognosing the response to treatment (for instance to interferon) of patients suffering from 
HCV infection comprising: 

at least one E1 protein or E1 peptide, more particularly an E1 protein or E1
peptide as defined above, 

a buffer or components necessary for producing the buffer enabling the 
binding reaction between these proteins or peptides and the anti-E1
antibodies present in a biological sample, 

means for detecting the immune complexes formed in the preceding 
binding reaction, 

possibly also an automated scanning and interpretation device for inferring 
a decrease of anti-E1 titers during the progression of treatment.

It is to be understood that also E2 protein and peptides according to the present
invention can be used to a certain degree to monitor/prognose HCV treatment as indicated
above for the E1 proteins or peptides because also the anti-E2 levels decrease in
comparison to antibodies to the other HCV antigens. It is to be understood, however, that it
might be possible to determine certain epitopes in the E2 region which would also be
suited for use in a test for monitoring/prognosing HCV disease.

The present invention also relates to a serotyping assay for detecting one or more 
serological types of HCV present in a biological sample, more particularly for detecting 
antibodies of the different types of HCV to be detected combined in one assay format, 
comprising at least the following steps : 

(i) contacting the biological sample to be analyzed for the presence of HCV 
antibodies of one or more serological types, with at least one of the E1
and/or E2 and/or E1/E2 protein compositions or at least one of the E1 or E2
peptide compositions as defined above, preferentially in an immobilized 
form under appropriate conditions which allow the formation of an immune 
complex, 

(ii) removing unbound components, 

(iii) incubating the immune complexes formed with heterologous antibodies, 
with said heterologous antibodies being conjugated to a detectable label 
under appropriate conditions, 

(iv) detecting the presence of said immune complexes visually or mechanically 
(e.g. by means of densitometry, fluorimetry, colorimetry) and inferring the
presence of one or more HCV serological types present from the observed 
binding pattern. 

It is to be understood that the compositions of proteins or peptides used in this 
method are recombinantly expressed type-specific envelope proteins or type-specific 
peptides. 
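
The last step of the serotyping method, inferring the serological types from the observed binding pattern, can be sketched as a simple thresholding of the signal measured for each type-specific antigen. This is a minimal illustration under stated assumptions: the cutoff, the signal values and the function name are hypothetical, and a real assay would also have to account for cross-reactivity between types.

```python
from typing import Dict, List


def infer_serotypes(signals: Dict[str, float], cutoff: float) -> List[str]:
    """Return the serological types whose type-specific antigen is reactive.

    signals maps a type label (e.g. "1b") to the signal measured for the
    corresponding type-specific envelope protein or peptide.
    """
    return sorted(hcv_type for hcv_type, signal in signals.items() if signal > cutoff)


# Hypothetical binding pattern from one sample tested against three antigens.
pattern = {"1b": 1.20, "3a": 0.10, "5a": 0.05}
detected = infer_serotypes(pattern, cutoff=0.30)
```

An empty result would indicate no type-specific reactivity, and more than one detected type could reflect either a mixed infection or cross-reactive antibodies.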

The present invention further relates to a kit for serotyping one or more serological 
types of HCV present in a biological sample, more particularly for detecting the antibodies 
to these serological types of HCV comprising: 

at least one E1 and/or E2 and/or E1/E2 protein or E1 or E2 peptide, as
defined above,

a buffer or components necessary for producing the buffer enabling the
binding reaction between these proteins or peptides and the anti-E1
antibodies present in a biological sample, 

means for detecting the immune complexes formed in the preceding 
binding reaction, 

possibly also an automated scanning and interpretation device for detecting 
the presence of one or more serological types present from the observed 
binding pattern. 

The present invention also relates to the use of a peptide or protein composition as 
defined above, for immobilization on a solid substrate and incorporation into a reversed 
phase hybridization assay, preferably for immobilization as parallel lines onto a solid
support such as a membrane strip, for determining the presence or the genotype of HCV
according to a method as defined above. Combination with other type-specific antigens 
from other HCV polyprotein regions also lies within the scope of the present invention. 

Figure and Table legends

Figure 1: Restriction map of plasmid pgpt ATA 18
Figure 2: Restriction map of plasmid pgs ATA 18
Figure 3: Restriction map of plasmid pMS 66
Figure 4: Restriction map of plasmid pv HCV-11A
Figure 5: Anti-E1 levels in non-responders to IFN treatment
Figure 6: Anti-E1 levels in responders to IFN treatment
Figure 7: Anti-E1 levels in patients with complete response to IFN treatment
Figure 8: Anti-E1 levels in incomplete responders to IFN treatment
Figure 9: Anti-E2 levels in non-responders to IFN treatment
Figure 10: Anti-E2 levels in responders to IFN treatment
Figure 11: Anti-E2 levels in incomplete responders to IFN treatment
Figure 12: Anti-E2 levels in complete responders to IFN treatment
Figure 13: Human anti-E1 reactivity competed with peptides
Figure 14: Competition of reactivity of anti-E1 monoclonal antibodies with peptides
Figure 15: Anti-E1 (epitope 1) levels in non-responders to IFN treatment
Figure 16: Anti-E1 (epitope 1) levels in responders to IFN treatment
Figure 17: Anti-E1 (epitope 2) levels in non-responders to IFN treatment
Figure 18: Anti-E1 (epitope 2) levels in responders to IFN treatment
Figure 19: Competition of reactivity of anti-E2 monoclonal antibodies with peptides
Figure 20: Human anti-E2 reactivity competed with peptides
Figure 21: Nucleic acid sequences of the present invention. The nucleic acid sequences
encoding an E1 or E2 protein according to the present invention may be
translated (SEQ ID NO 3 to 13, 21-31, 35 and 41-49 are translated in a
reading frame starting from residue number 1, SEQ ID NO 37-39 are
translated in a reading frame starting from residue number 2), into the
amino acid sequences of the respective E1 or E2 proteins as shown in the
sequence listing.

Figure 22: ELISA results obtained from lentil lectin chromatography eluate fractions
of 4 different E1 purifications of cell lysates infected with vvHCV39 (type
1b), vvHCV40 (type 1b), vvHCV62 (type 3a), and vvHCV63 (type 5a).
Figure 23: Elution profiles obtained from the lentil lectin chromatography of the 4
different E1 constructs on the basis of the values as shown in Figure 22.
Figure 24: ELISA results obtained from fractions obtained after gelfiltration
chromatography of 4 different E1 purifications of cell lysates infected with
vvHCV39 (type 1b), vvHCV40 (type 1b), vvHCV62 (type 3a), and
vvHCV63 (type 5a).

Figure 25: Profiles obtained from purifications of E1 proteins of type 1b (1), type 3a
(2), and type 5a (3) (from RK13 cells infected with vvHCV39, vvHCV62,
and vvHCV63, respectively; purified on lentil lectin and reduced as in
example 5.2 - 5.3) and a standard (4). The peaks indicated with '1', '2', and
'3', represent pure E1 protein peaks (see Figure 24, E1 reactivity mainly in
fractions 26 to 30).

Figure 26: Silver staining of an SDS-PAGE as described in example 4 of a raw lysate
of E1 vvHCV40 (type 1b) (lane 1), pool 1 of the gelfiltration of vvHCV40
representing fractions 10 to 17 as shown in Figure 25 (lane 2), pool 2 of the
gelfiltration of vvHCV40 representing fractions 18 to 25 as shown in
Figure 25 (lane 3), and E1 pool (fractions 26 to 30) (lane 4).
Figure 27: Streptavidin-alkaline phosphatase blot of the fractions of the gelfiltration
of E1 constructs 39 (type 1b) and 62 (type 3a). The proteins were labelled
with NEM-biotin. Lane 1: start gelfiltration construct 39, lane 2: fraction 26
construct 39, lane 3: fraction 27 construct 39, lane 4: fraction 28 construct
39, lane 5: fraction 29 construct 39, lane 6: fraction 30 construct 39, lane 7:
fraction 31 construct 39, lane 8: molecular weight marker, lane 9: start
gelfiltration construct 62, lane 10: fraction 26 construct 62, lane 11: fraction
27 construct 62, lane 12: fraction 28 construct 62, lane 13: fraction 29
construct 62, lane 14: fraction 30 construct 62, lane 15: fraction 31
construct 62.

Figure 28: Silver staining of an SDS-PAGE gel of the gelfiltration fractions of
vvHCV-39 (E1s, type 1b) and vvHCV-62 (E1s, type 3a) run under identical
conditions as Figure 26. Lane 1: start gelfiltration construct 39, lane 2:
fraction 26 construct 39, lane 3: fraction 27 construct 39, lane 4: fraction 28
construct 39, lane 5: fraction 29 construct 39, lane 6: fraction 30 construct
39, lane 7: fraction 31 construct 39, lane 8: molecular weight marker, lane 9:
start gelfiltration construct 62, lane 10: fraction 26 construct 62, lane 11:
fraction 27 construct 62, lane 12: fraction 28 construct 62, lane 13: fraction
29 construct 62, lane 14: fraction 30 construct 62, lane 15: fraction 31
construct 62.

Figure 29: Western Blot analysis with anti-E1 mouse monoclonal antibody 5E1A10
giving a complete overview of the purification procedure. Lane 1: crude
lysate, Lane 2: flow through of lentil chromatography, Lane 3: wash with
Empigen BB after lentil chromatography, Lane 4: Eluate of lentil
chromatography, Lane 5: Flow through during concentration of the lentil
eluate, Lane 6: Pool of E1 after Size Exclusion Chromatography
(gelfiltration).

Figure 30: OD280 profile (continuous line) of the lentil lectin chromatography of E2
protein from RK13 cells infected with vvHCV44. The dotted line
represents the E2 reactivity as detected by ELISA (as in example 6).
Figure 31A: OD280 profile (continuous line) of the lentil-lectin gelfiltration
chromatography E2 protein pool from RK13 cells infected with vvHCV44
in which the E2 pool is applied immediately on the gelfiltration column
(non-reduced conditions). The dotted line represents the E2 reactivity as
detected by ELISA (as in example 6).
Figure 31B: OD280 profile (continuous line) of the lentil-lectin gelfiltration
chromatography E2 protein pool from RK13 cells infected with vvHCV44
in which the E2 pool was reduced and blocked according to Example 5.3
(reduced conditions). The dotted line represents the E2 reactivity as
detected by ELISA (as in example 6).

Figure 32: Ni2+-IMAC chromatography and ELISA reactivity of the E2 protein as
expressed from vvHCV44 after gelfiltration under reducing conditions as
shown in Figure 31B.

Figure 33: Silver staining of an SDS-PAGE of 0.5 µg of purified E2 protein recovered
by a 200 mM imidazole elution step (lane 2) and a 30 mM imidazole wash
(lane 1) of the Ni2+-IMAC chromatography as shown in Figure 32.

Figure 34: OD profiles of a desalting step of the purified E2 protein recovered by 200
mM imidazole as shown in Figure 33, intended to remove imidazole.

Figure 35A: Antibody levels to the different HCV antigens (Core 1, Core 2, E2HCVR,
NS3) for NR and LTR followed during treatment and over a period of 6 to
12 months after treatment determined by means of the LIAscan method.
The average values are indicated by the curves with the open squares.
Figure 35B: Antibody levels to the different HCV antigens (NS4, NS5, E1 and E2) for
NR and LTR followed during treatment and over a period of 6 to 12
months after treatment determined by means of the LIAscan method. The
average values are indicated by the curve with the open squares.

Figure 36: Average E1 antibody (E1Ab) and E2 antibody (E2Ab) levels in the LTR
and NR groups.

Figure 37: Average E1 antibody (E1Ab) levels for non-responders (NR) and long
term responders (LTR) for type 1b and type 3a.
Figure 38 : Relative map positions of the anti-E2 monoclonal antibodies. 
Figure 39: Partial deglycosylation of HCV E1 envelope protein. The lysate of
vvHCV10A-infected RK13 cells was incubated with different
concentrations of glycosidases according to the manufacturer's instructions.
Right panel: Glycopeptidase F (PNGase F). Left panel: Endoglycosidase H
(EndoH).

Figure 40: Partial deglycosylation of HCV E2 envelope proteins. The lysates of
vvHCV64-infected (E2) and vvHCV41-infected (E2s) RK13 cells were
incubated with different concentrations of Glycopeptidase F (PNGase F)
according to the manufacturer's instructions.
Figure 41: In vitro mutagenesis of HCV E1 glycoproteins. Map of the mutated
sequences and the creation of new restriction sites.
Figure 42A: In vitro mutagenesis of HCV E1 glycoprotein (part 1). First step of PCR
amplification.

Figure 42B: In vitro mutagenesis of HCV E1 glycoprotein (part 2). Overlap extension
and nested PCR.

Figure 43: In vitro mutagenesis of HCV E1 glycoproteins. Map of the PCR mutated
fragments (GLY-# and OVR-#) synthesized during the first step of
amplification.

Figure 44A: Analysis of E1 glycoprotein mutants by Western blot expressed in HeLa
(left) and RK13 (right) cells. Lane 1: wild type VV (vaccinia virus), Lane 2:
original E1 protein (vvHCV-10A), Lane 3: E1 mutant Gly-1 (vvHCV-81),
Lane 4: E1 mutant Gly-2 (vvHCV-82), Lane 5: E1 mutant Gly-3 (vvHCV-83),
Lane 6: E1 mutant Gly-4 (vvHCV-84), Lane 7: E1 mutant Gly-5
(vvHCV-85), Lane 8: E1 mutant Gly-6 (vvHCV-86).
Figure 44B: Analysis of E1 glycosylation mutant vaccinia viruses by PCR
amplification/restriction. Lane 1: E1 (vvHCV-10A), BspE I, Lane 2:
E1.GLY-1 (vvHCV-81), BspE I, Lane 4: E1 (vvHCV-10A), Sac I, Lane 5:
E1.GLY-2 (vvHCV-82), Sac I, Lane 7: E1 (vvHCV-10A), Sac I, Lane 8:
E1.GLY-3 (vvHCV-83), Sac I, Lane 10: E1 (vvHCV-10A), Stu I, Lane 11:
E1.GLY-4 (vvHCV-84), Stu I, Lane 13: E1 (vvHCV-10A), Sma I, Lane 14:
E1.GLY-5 (vvHCV-85), Sma I, Lane 16: E1 (vvHCV-10A), Stu I, Lane 17:
E1.GLY-6 (vvHCV-86), Stu I, Lane 3 - 6 - 9 - 12 - 15: Low Molecular
Weight Marker, pBluescript SK+, Msp I.
Figure 45: SDS polyacrylamide gel electrophoresis of recombinant E2 expressed in
S. cerevisiae. Inocula were grown in leucine selective medium for 72 hrs.
and diluted 1/15 in complete medium. After 10 days of culture at 28°C,
medium samples were taken. The equivalent of 200 µl of culture
supernatant concentrated by speedvac was loaded on the gel. Two
independent transformants were analysed.

Figure 46: SDS polyacrylamide gel electrophoresis of recombinant E2 expressed in a
glycosylation deficient S. cerevisiae mutant. Inocula were grown in
leucine selective medium for 72 hrs. and diluted 1/15 in complete medium.
After 10 days of culture at 28°C, medium samples were taken. The
equivalent of 350 µl of culture supernatant, concentrated by ion exchange
chromatography, was loaded on the gel.

Table 1: Features of the respective clones and primers used for amplification for
constructing the different forms of the E1 protein as depicted in
Example 1.

Table 2: Summary of Anti-E1 tests
Table 3: Synthetic peptides for competition studies
Table 4: Changes of envelope antibody levels over time
Table 5: Difference between LTR and NR
Table 6: Competition experiments between murine E2 monoclonal antibodies
Table 7: Primers for construction of E1 glycosylation mutants
Table 8: Analysis of E1 glycosylation mutants by ELISA

Example 1: Cloning and expression of the hepatitis C virus E1 protein

1. Construction of vaccinia virus recombination vectors

The pgptATA18 vaccinia recombination plasmid is a modified version of pATA18
(Stunnenberg et al., 1988) with an additional insertion containing the E. coli xanthine
guanine phosphoribosyl transferase gene under the control of the vaccinia virus I3
intermediate promoter (Figure 1). The plasmid pgsATA18 was constructed by inserting an
oligonucleotide linker with SEQ ID NO 1/94, containing stop codons in the three reading
frames, into the Pst I and HindIII-cut pATA18 vector. This created an extra Pac I
restriction site (Figure 2). The original HindIII site was not restored.
Oligonucleotide linker with SEQ ID NO 1/94:

5' G GCATGC AAGCTT AATTAATT 3'

3' ACGTC CGTACG TTCGAA TTAATTAA TCGA 5'

PstI SphI HindIII Pac I (HindIII)
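
As a quick consistency check on the linker above, the following sketch (an illustration only, not part of the original disclosure) rebuilds the full-length upper strand from the lower strand as printed, then checks for the Pac I recognition sequence and for a stop codon in each reading frame. The helper names are hypothetical; the two sequences are copied from the linker as printed, with spaces removed.

```python
# Illustrative check of the SEQ ID NO 1/94 linker (helper names are hypothetical).
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}

# Strands as printed in the text (spaces removed).
TOP_5_TO_3 = "GGCATGCAAGCTTAATTAATT"
BOTTOM_3_TO_5 = "ACGTCCGTACGTTCGAATTAATTAATCGA"


def complement(seq: str) -> str:
    """Base-by-base complement, keeping the left-to-right order of the input."""
    return seq.translate(COMPLEMENT)


def frames_with_stop(seq: str) -> set[int]:
    """Return the reading frames (0, 1, 2) that contain at least one stop codon."""
    return {i % 3 for i in range(len(seq) - 2) if seq[i:i + 3] in STOP_CODONS}


# The lower strand is printed 3'->5', so its plain complement reads 5'->3'.
full_top = complement(BOTTOM_3_TO_5)

print(full_top)
print("printed top strand pairs with lower strand:", TOP_5_TO_3 in full_top)
print("Pac I site (TTAATTAA) present:", "TTAATTAA" in full_top)
print("frames with a stop codon:", sorted(frames_with_stop(full_top)))
```

The check is made on the strand that results once the single-stranded overhangs are filled by ligation into the cut vector, since that is the sequence in which the three reading frames are read.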

In order to facilitate rapid and efficient purification by means of Ni2+ chelation of
engineered histidine stretches fused to the recombinant proteins, the vaccinia
recombination vector pMS66 was designed to express secreted proteins with an additional
carboxy-terminal histidine tag. An oligonucleotide linker with SEQ ID NO 2/95,
containing unique sites for 3 restriction enzymes generating blunt ends (Sma I, Stu I and
Pml I/Bbr PI) was synthesized in such a way that the carboxy-terminal end of any cDNA
could be inserted in frame with a sequence encoding the protease factor Xa cleavage site
followed by a nucleotide sequence encoding 6 histidines and 2 stop codons (a new Pac I
restriction site was also created downstream of the 3' end). This oligonucleotide with SEQ ID
NO 2/95 was introduced between the Xma I and Pst I sites of pgptATA18 (Figure 3).

Oligonucleotide linker with SEQ ID NO 2/95 : 

CCGGG C^GGCCTGCACGTGATCGAGGGCAGACACCATCACCACCATCTCTAATMTTAATTAA CTGCA 3 ' 

C CTTCXXSGACGTGCACTAGCTCCCGTCTGTGGTAGTGGTGGTAGTGAT^ G 



Xma I                                                              Pst I
Example 2: Construction of HCV recombinant plasmids

2.1. Constructs encoding different forms of the E1 protein

Polymerase Chain Reaction (PCR) products were derived from the serum samples
by RNA preparation and subsequent reverse-transcription and PCR as described previously
(Stuyver et al., 1993b). Table 1 shows the features of the respective clones and the primers
used for amplification. The PCR fragments were cloned into the Sma I-cut pSP72
(Promega) plasmids. The following clones were selected for insertion into vaccinia
recombination vectors: HCC19A (SEQ ID NO 3), HCC110A (SEQ ID NO 5), HCC111A
(SEQ ID NO 7), HCC112A (SEQ ID NO 9), HCC113A (SEQ ID NO 11), and HCC117A
(SEQ ID NO 13) as depicted in Figure 21. cDNA fragments containing the E1-coding
regions were cleaved by EcoRI and HindIII restriction from the respective pSP72 plasmids
and inserted into the EcoRI/HindIII-cut pgptATA-18 vaccinia recombination vector
(described in example 1), downstream of the 11K vaccinia virus late promoter. The
respective plasmids were designated pvHCV-9A, pvHCV-10A, pvHCV-11A, pvHCV-12A,
pvHCV-13A and pvHCV-17A, of which pvHCV-11A is shown in Figure 4.

2.2. Hydrophobic region E1 deletion mutants

Clone HCC137, containing a deletion of codons Asp264 to Val287 (nucleotides
790 to 861, region encoding hydrophobic domain I) was generated as follows: 2 PCR
fragments were generated from clone HCC110A with primer sets HCPr52 (SEQ ID NO
16)/HCPr107 (SEQ ID NO 19) and HCPr108 (SEQ ID NO 20)/HCPr54 (SEQ ID NO 18).
These primers are shown in Figure 21. The two PCR fragments were purified from agarose
gel after electrophoresis and 1 ng of each fragment was used together as template for PCR
by means of primers HCPr52 (SEQ ID NO 16) and HCPr54 (SEQ ID NO 18). The
resulting fragment was cloned into the Sma I-cut pSP72 vector and clones containing the
deletion were readily identified because of the deletion of 24 codons (72 base pairs).
Plasmid pSP72HCC137 containing clone HCC137 (SEQ ID NO 15) was selected. A
recombinant vaccinia plasmid containing the full-length E1 cDNA lacking hydrophobic
domain I was constructed by inserting the HCV sequence surrounding the deletion
(fragment cleaved by Xma I and BamH I from the vector pSP72-HCC137) into the Xma I-
BamH I sites of the vaccinia plasmid pvHCV-10A. The resulting plasmid was named
pvHCV-37. After confirmatory sequencing, the amino-terminal region containing the
internal deletion was isolated from this vector pvHCV-37 (cleavage by EcoR I and BstE II)
and reinserted into the EcoR I and BstE II-cut pvHCV-11A plasmid. This construct was
expected to express an E1 protein with both hydrophobic domains deleted and was named
pvHCV-38. The E1-coding region of clone HCC138 is represented by SEQ ID NO 23.
As the hydrophilic region at the E1 carboxyterminus (theoretically extending to
around amino acids 337-340) was not completely included in construct pvHCV-38, a
larger E1 region lacking hydrophobic domain I was isolated from the pvHCV-37 plasmid
by EcoR I/BamH I cleavage and cloned into an EcoRI/BamHI-cut pgsATA-18 vector. The
resulting plasmid was named pvHCV-39 and contained clone HCC139 (SEQ ID NO 25).
The same fragment was cleaved from the pvHCV-37 vector by BamH I (of which the
sticky ends were filled with Klenow DNA Polymerase I (Boehringer)) and subsequently by
EcoR I (5' cohesive end). This sequence was inserted into the EcoRI and Bbr PI-cut vector
pMS-66. This resulted in clone HCC140 (SEQ ID NO 27) in plasmid pvHCV-40,
containing a 6 histidine tail at its carboxy-terminal end.
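
The coordinates quoted for the hydrophobic domain I deletion can be checked with simple codon arithmetic. The sketch below assumes, as the quoted numbers imply, that nucleotide 1 is the first base of codon 1, so that codon n spans nucleotides 3n-2 to 3n; the function name is illustrative only.

```python
def codon_range_to_nucleotides(first_codon: int, last_codon: int) -> tuple[int, int, int, int]:
    """Map an inclusive codon range to (first_nt, last_nt, n_codons, n_bases).

    Assumes nucleotide 1 is the first base of codon 1 (1-based, inclusive).
    """
    if last_codon < first_codon:
        raise ValueError("last_codon must not precede first_codon")
    first_nt = 3 * first_codon - 2
    last_nt = 3 * last_codon
    n_codons = last_codon - first_codon + 1
    return first_nt, last_nt, n_codons, 3 * n_codons


# Deletion of codons Asp264 to Val287 in clone HCC137.
print(codon_range_to_nucleotides(264, 287))
```

The result agrees with the figures stated for clone HCC137: nucleotides 790 to 861, 24 codons, 72 base pairs.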

2.3. E1 of other genotypes

Clone HCC162 (SEQ ID NO 29) was derived from a type 3a-infected patient with
chronic hepatitis C (serum BR36, clone BR36-9-13, SEQ ID NO 19 in WO 94/25601, and
see also Stuyver et al. 1993a) and HCC163 (SEQ ID NO 31) was derived from a type 5a-
infected child with post-transfusion hepatitis (serum BE95, clone PC-4-1, SEQ ID NO 45
in WO 94/25601).

2.4. E2 constructs 

The HCV E2 PCR fragment 22 was obtained from serum BE11 (genotype 1b) by
means of primers HCPr109 (SEQ ID NO 33) and HCPr72 (SEQ ID NO 34) using
techniques of RNA preparation, reverse-transcription and PCR, as described in Stuyver et
al., 1993b, and the fragment was cloned into the Sma I-cut pSP72 vector. Clone HCC122A
(SEQ ID NO 35) was cut with NcoI/AlwNI or by BamHI/AlwNI and the sticky ends of the
fragments were blunted (NcoI and BamHI sites with Klenow DNA Polymerase I
(Boehringer), and AlwNI with T4 DNA polymerase (Boehringer)). The BamHI/AlwNI
cDNA fragment was then inserted into the vaccinia pgsATA-18 vector that had been
linearized by EcoR I and Hind III cleavage and of which the cohesive ends had been filled
with Klenow DNA Polymerase (Boehringer). The resulting plasmid was named pvHCV-41
and encoded the E2 region from amino acids Met347 to Gln673, including 37 amino
acids (from Met347 to Gly383) of the E1 protein that can serve as signal sequence. The
same HCV cDNA was inserted into the EcoR I and Bbr PI-cut vector pMS66, that had
subsequently been blunt ended with Klenow DNA Polymerase. The resulting plasmid was
named pvHCV-42 and also encoded amino acids 347 to 683. The NcoI/AlwNI fragment
was inserted in a similar way into the same sites of pgsATA-18 (pvHCV-43) or pMS-66
vaccinia vectors (pvHCV-44). pvHCV-43 and pvHCV-44 encoded amino acids 364 to 673
of the HCV polyprotein, of which amino acids 364 to 383 were derived from the natural
carboxyterminal region of the E1 protein encoding the signal sequence for E2, and amino
acids 384 to 673 of the mature E2 protein.

2.5. Generation of recombinant HCV-vaccinia viruses 

Rabbit kidney RK13 cells (ATCC CCL 37), human osteosarcoma 143B thymidine
kinase deficient (TK-) (ATCC CRL 8303), HeLa (ATCC CCL 2), and Hep G2 (ATCC HB
8065) cell lines were obtained from the American Type Culture Collection (ATCC,
Rockville, Md, USA). The cells were grown in Dulbecco's modified Eagle medium
(DMEM) supplemented with 10% foetal calf serum, and with Earle's salts (EMEM) for
RK13 and 143B (TK-), and with glucose (4 g/l) for Hep G2. The vaccinia virus WR strain
(Western Reserve, ATCC VR119) was routinely propagated in either 143B or RK13 cells,
as described previously (Panicali & Paoletti, 1982; Piccini et al., 1987; Mackett et al.,
1982, 1984, and 1986). A confluent monolayer of 143B cells was infected with wild type
vaccinia virus at a multiplicity of infection (m.o.i.) of 0.1 (= 0.1 plaque forming unit (PFU)
per cell). Two hours later, the vaccinia recombination plasmid was transfected into the
infected cells in the form of a calcium phosphate coprecipitate containing 500 ng of the
plasmid DNA to allow homologous recombination (Graham & van der Eb, 1973; Mackett
et al., 1985). Recombinant viruses expressing the Escherichia coli xanthine-guanine
phosphoribosyl transferase (gpt) protein were selected on rabbit kidney RK13 cells
incubated in selection medium (EMEM containing 25 µg/ml mycophenolic acid (MPA),
250 µg/ml xanthine, and 15 µg/ml hypoxanthine; Falkner and Moss, 1988; Janknecht et al.,
1991). Single recombinant viruses were purified on fresh monolayers of RK13 cells under
a 0.9% agarose overlay in selection medium. Thymidine kinase deficient (TK-)
recombinant viruses were selected and then plaque purified on fresh monolayers of human
143B cells (TK-) in the presence of 25 µg/ml 5-bromo-2'-deoxyuridine. Stocks of purified
recombinant HCV-vaccinia viruses were prepared by infecting either human 143B or
rabbit RK13 cells at an m.o.i. of 0.05 (Mackett et al., 1988). The insertion of the HCV
cDNA fragment in the recombinant vaccinia viruses was confirmed on an aliquot (50 µl) of
the cell lysate after the MPA selection by means of PCR with the primers used to clone the
respective HCV fragments (see Table 1). The recombinant vaccinia-HCV viruses were
named according to the vaccinia recombination plasmid number, e.g. the recombinant
vaccinia virus vvHCV-10A was derived from recombining the wild type WR strain with
the pvHCV-10A plasmid.

Example 3: Infection of cells with recombinant vaccinia viruses

A confluent monolayer of RK13 cells was infected at an m.o.i. of 3 with the
recombinant HCV-vaccinia viruses as described in example 2. For infection, the cell
monolayer was washed twice with phosphate-buffered saline pH 7.4 (PBS) and the
recombinant vaccinia virus stock was diluted in MEM medium. Two hundred µl of the
virus solution was added per 10^6 cells such that the m.o.i. was 3, and incubated for 45 min
at 24°C. The virus solution was aspirated and 2 ml of complete growth medium (see
example 2) was added per 10^6 cells. The cells were incubated for 24 hr at 37°C during
which expression of the HCV proteins took place.
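
The inoculum described above fixes the titre that the diluted virus must have: with 200 µl per 10^6 cells and an m.o.i. of 3, the diluted virus solution has to contain 1.5 x 10^7 PFU/ml. A minimal sketch of that arithmetic follows; the function names are illustrative, and the stock titre in the last lines is a made-up value used only to show the dilution step.

```python
def required_titre_pfu_per_ml(moi: float, cells: float, inoculum_ml: float) -> float:
    """Titre the diluted virus must have so that `inoculum_ml` delivers `moi` PFU per cell."""
    return moi * cells / inoculum_ml


def dilution_factor(stock_titre_pfu_per_ml: float, target_titre_pfu_per_ml: float) -> float:
    """Fold dilution of the stock needed to reach the target titre."""
    if stock_titre_pfu_per_ml < target_titre_pfu_per_ml:
        raise ValueError("stock is too dilute for the requested m.o.i.")
    return stock_titre_pfu_per_ml / target_titre_pfu_per_ml


# Example 3: m.o.i. of 3, 200 µl (0.2 ml) of virus solution per 1e6 cells.
target = required_titre_pfu_per_ml(moi=3, cells=1e6, inoculum_ml=0.2)
print(f"{target:.2e} PFU/ml")

# Hypothetical stock titre of 1e9 PFU/ml (not taken from the text).
print(f"dilute stock {dilution_factor(1e9, target):.1f}-fold")
```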
Example 4: Analysis of recombinant proteins by means of western blotting

The infected cells were washed two times with PBS, directly lysed with lysis buffer
(50 mM Tris.HCl pH 7.5, 150 mM NaCl, 1% Triton X-100, 5 mM MgCl2, 1 ng/ml
aprotinin (Sigma, Bornem, Belgium)) or detached from the flasks by incubation in 50 mM
Tris.HCl pH 7.5/10 mM EDTA/150 mM NaCl for 5 min, and collected by centrifugation
(5 min at 1000 g). The cell pellet was then resuspended in 200 µl lysis buffer (50 mM
Tris.HCl pH 8.0, 2 mM EDTA, 150 mM NaCl, 5 mM MgCl2, aprotinin, 1% Triton X-100)
per 10^6 cells. The cell lysates were cleared for 5 min at 14,000 rpm in an Eppendorf
centrifuge to remove the insoluble debris. Proteins of 20 µl lysate were separated by means
of sodium dodecyl sulphate-polyacrylamide gel electrophoresis (SDS-PAGE). The proteins
were then electro-transferred from the gel to a nitrocellulose sheet (Amersham) using a
Hoefer HSI transfer unit cooled to 4°C for 2 hr at 100 V constant voltage, in transfer buffer
(25 mM Tris.HCl pH 8.0, 192 mM glycine, 20% (v/v) methanol). Nitrocellulose filters
were blocked with Blotto (5% (w/v) fat-free instant milk powder in PBS; Johnson et al.,
1981) and incubated with primary antibodies diluted in Blotto/0.1% Tween 20. Usually, a
human negative control serum or serum of a patient infected with HCV was diluted 200
times and preincubated for 1 hour at room temperature with 200 times diluted wild type
vaccinia virus-infected cell lysate in order to decrease the non-specific binding. After
washing with Blotto/0.1% Tween 20, the nitrocellulose filters were incubated with alkaline
phosphatase substrate solution diluted in Blotto/0.1% Tween 20. After washing with 0.1%
Tween 20 in PBS, the filters were incubated with alkaline phosphatase substrate solution
(100 mM Tris.HCl pH 9.5, 100 mM NaCl, 5 mM MgCl2, 0.38 µg/ml nitroblue tetrazolium,
0.165 µg/ml 5-bromo-4-chloro-3-indolylphosphate). All steps, except the electrotransfer,
were performed at room temperature.



Example 5: Purification of recombinant E1 or E2 protein

5.1. Lysis 

Infected RK13 cells (carrying E1 or E2 constructs) were washed 2 times with
phosphate-buffered saline (PBS) and detached from the culture recipients by incubation in
PBS containing 10 mM EDTA. The detached cells were washed twice with PBS and 1 ml
of lysis buffer (50 mM Tris.HCl pH 7.5, 150 mM NaCl, 1% Triton X-100, 5 mM MgCl2, 1
µg/ml aprotinin (Sigma, Bornem, Belgium)) containing 2 mM biotinylated N-
ethylmaleimide (biotin-NEM) (Sigma) was added per 10 s cells at 4°C. This lysate was
homogenized with a type B douncer and left at room temperature for 0.5 hours. Another 5
volumes of lysis buffer containing 10 mM N-ethylmaleimide (NEM, Aldrich, Bornem,
Belgium) was added to the primary lysate and the mixture was left at room temperature for
15 min. Insoluble cell debris was cleared from the solution by centrifugation in a Beckman
JA-14 rotor at 14,000 rpm (30100 g at rmax) for 1 hour at 4°C.
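
The relation between rotor speed and the quoted relative centrifugal force follows the standard formula RCF = 1.118 x 10^-5 x r x N^2, with the radius r in cm and the speed N in rpm. The sketch below uses it to back-calculate the radius implied by the two figures given above (14,000 rpm and 30100 g); it relies only on those figures and not on any rotor specification, and the function names are illustrative.

```python
RCF_CONSTANT = 1.118e-5  # standard factor for radius in cm and speed in rpm


def rcf(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (x g) at a given speed and radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def radius_for_rcf(rpm: float, target_rcf: float) -> float:
    """Radius (cm) at which `rpm` produces `target_rcf`."""
    return target_rcf / (RCF_CONSTANT * rpm ** 2)


# Figures quoted in 5.1: 14,000 rpm corresponding to 30100 g at r max.
r_max_cm = radius_for_rcf(14_000, 30_100)
print(f"implied r max: {r_max_cm:.2f} cm")
```

The same helper can be used in the other direction when a protocol has to be transferred to a rotor with a different maximum radius.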

5.2. Lectin Chromatography 

The cleared cell lysate was loaded at a rate of 1 ml/min on a 0.8 by 10 cm Lentil-
lectin Sepharose 4B column (Pharmacia) that had been equilibrated with 5 column
volumes of lysis buffer at a rate of 1 ml/min. The lentil-lectin column was washed with 5 to
10 column volumes of buffer 1 (0.1 M potassium phosphate pH 7.3, 500 mM KCl, 5%
glycerol, 1 mM 6-NH2-hexanoic acid, 1 mM MgCl2, and 1% DecylPEG (KWANT,
Bedum, The Netherlands)). In some experiments, the column was subsequently washed
with 10 column volumes of buffer 1 containing 0.5% Empigen-BB (Calbiochem, San
Diego, CA, USA) instead of 1% DecylPEG. The bound material was eluted by applying
elution buffer (10 mM potassium phosphate pH 7.3, 5% glycerol, 1 mM hexanoic acid,
1 mM MgCl2, 0.5% Empigen-BB, and 0.5 M α-methyl-mannopyranoside). The eluted
material was fractionated and fractions were screened for the presence of E1 or E2 protein
by means of ELISA as described in example 6. Figure 22 shows ELISA results obtained
from lentil lectin eluate fractions of 4 different E1 purifications of cell lysates infected with
vvHCV39 (type 1b), vvHCV40 (type 1b), vvHCV62 (type 3a), and vvHCV63 (type 5a).
Figure 23 shows the profiles obtained from the values shown in Figure 22. These results
show that the lectin affinity column can be employed for envelope proteins of the different
types of HCV.

5.3. Concentration and partial reduction 

The E1- or E2-positive fractions were pooled and concentrated on a Centricon 30
kDa (Amicon) by centrifugation for 3 hours at 5,000 rpm in a Beckman JA-20 rotor at 4°C.
In some experiments the E1- or E2-positive fractions were pooled and concentrated by
nitrogen evaporation. An equivalent of 3 x 10^8 cells was concentrated to approximately 200
µl. For partial reduction, 30% Empigen-BB (Calbiochem, San Diego, CA, USA) was
added to this 200 µl to a final concentration of 3.5%, and 1 M DTT in H2O was
subsequently added to a final concentration of 1.5 to 7.5 mM and incubated for 30 min at
37°C. NEM (1 M in dimethylsulphoxide) was subsequently added to a final concentration
of 50 mM and left to react for another 30 min at 37°C to block the free sulphydryl groups.
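
Adding a concentrated stock to a fixed sample changes the total volume, so the volume to add for a given final concentration is v = Cf x V0 / (Cs - Cf). The following sketch applies this to the partial reduction step. It is an illustration under stated assumptions: the additions are made one after the other to the 200 µl concentrate, the detergent already present in the pool is ignored, and the upper end of the DTT range is used; the function name is hypothetical.

```python
def stock_volume_to_add(sample_ul: float, stock_conc: float, final_conc: float) -> float:
    """Volume of stock (µl) to add to `sample_ul` to reach `final_conc`.

    Both concentrations must use the same unit; the sample is assumed to
    contain none of the added component beforehand.
    """
    if final_conc >= stock_conc:
        raise ValueError("final concentration must be below the stock concentration")
    return final_conc * sample_ul / (stock_conc - final_conc)


volume = 200.0  # µl of concentrated lentil-lectin pool

empigen = stock_volume_to_add(volume, stock_conc=30.0, final_conc=3.5)  # % Empigen-BB
volume += empigen

dtt = stock_volume_to_add(volume, stock_conc=1000.0, final_conc=7.5)  # mM, upper end of range
volume += dtt

nem = stock_volume_to_add(volume, stock_conc=1000.0, final_conc=50.0)  # mM
volume += nem

print(f"Empigen-BB: {empigen:.1f} µl, DTT: {dtt:.2f} µl, NEM: {nem:.1f} µl")
```

Because each later addition slightly dilutes the earlier components, the concentrations reached at the end are marginally below the nominal values; for the volumes involved here the effect is a few percent.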

5.4. Gel filtration chromatography 

A Superdex-200 HR 10/20 column (Pharmacia) was equilibrated with 3 column
volumes PBS/3% Empigen-BB. The reduced mixture was injected in a 500 µl sample loop
of the Smart System (Pharmacia) and PBS/3% Empigen-BB buffer was added for
gelfiltration. Fractions of 250 µl were collected from V0 to Vt. The fractions were screened
for the presence of E1 or E2 protein as described in example 6.

Figure 24 shows ELISA results obtained from fractions obtained after gelfiltration
chromatography of 4 different E1 purifications of cell lysates infected with vvHCV39
(type 1b), vvHCV40 (type 1b), vvHCV62 (type 3a), and vvHCV63 (type 5a). Figure 25
shows the profiles obtained from purifications of E1 proteins of types 1b, 3a, and 5a (from
RK13 cells infected with vvHCV39, vvHCV62, and vvHCV63, respectively; purified on
lentil lectin and reduced as in the previous examples). The peaks indicated with '1', '2', and
'3', represent pure E1 protein peaks (E1 reactivity mainly in fractions 26 to 30). These
peaks show very similar molecular weights of approximately 70 kDa, corresponding to
dimeric E1 protein. Other peaks in the three profiles represent vaccinia virus and/or cellular
proteins which could be separated from E1 only because of the reduction step as outlined
in example 5.3. and because of the subsequent gelfiltration step in the presence of the
proper detergent. As shown in Figure 26, pool 1 (representing fractions 10 to 17) and pool 2
(representing fractions 18 to 25) contain contaminating proteins not present in the E1 pool
(fractions 26 to 30). The E1 peak fractions were run on SDS/PAGE and blotted as
described in example 4. Proteins labelled with NEM-biotin were detected by streptavidin-
alkaline phosphatase as shown in Figure 27. It can be readily observed that, amongst
others, the 29 kDa and 45 kDa contaminating proteins present before the gelfiltration
chromatography (lane 1) are only present at very low levels in the fractions 26 to 30. The
band at approximately 65 kDa represents the E1 dimeric form that could not be entirely
disrupted into the monomeric E1 form. Similar results were obtained for the type 3a E1
protein (lanes 10 to 15), which shows a faster mobility on SDS/PAGE because of the
presence of only 5 carbohydrates instead of 6. Figure 28 shows a silver stain of an
SDS/PAGE gel run in identical conditions as in Figure 26. A complete overview of the
purification procedure is given in Figure 29.

The presence of purified E1 protein was further confirmed by means of western
blotting as described in example 4. The dimeric E1 protein appeared to be non-aggregated
and free of contaminants. The subtype 1b E1 protein purified from vvHCV40-infected
cells according to the above scheme was aminoterminally sequenced on a 477 Perkin-
Elmer sequencer and appeared to contain a tyrosine as first residue. This confirmed that the
E1 protein had been cleaved by the signal peptidase at the correct position (between A191
and Y192) from its signal sequence. This confirms the finding of Hijikata et al. (1991) that
the aminoterminus of the mature E1 protein starts at amino acid position 192.

5.5. Purification of the E2 protein 

The E2 protein (amino acids 384 to 673) was purified from RK13 cells infected
with vvHCV44 as indicated in Examples 5.1 to 5.4. Figure 30 shows the OD280 profile
(continuous line) of the lentil lectin chromatography. The dotted line represents the E2
reactivity as detected by ELISA (see example 6). Figure 31 shows the same profiles
obtained from gelfiltration chromatography of the lentil-lectin E2 pool (see Figure 30), part
of which was reduced and blocked according to the methods as set out in example 5.3., and
part of which was immediately applied to the column. Both parts of the E2 pool were run
on separate gelfiltration columns. It could be demonstrated that E2 forms covalently-linked
aggregates with contaminating proteins if no reduction has been performed. After
reduction and blocking, the majority of contaminating proteins segregated into the V0
fraction. Other contaminating proteins that copurified with the E2 protein were not covalently
linked to the E2 protein any more, because these contaminants could be removed in a
subsequent step. Figure 32 shows an additional Ni2+-IMAC purification step carried out for
the E2 protein purification. This affinity purification step employs the 6 histidine residues
added to the E2 protein as expressed from vvHCV44. Contaminating proteins either run
through the column or can be removed by a 30 mM imidazole wash. Figure 33 shows a
silver-stained SDS/PAGE of 0.5 µg of purified E2 protein and a 30 mM imidazole wash.
The pure E2 protein could be easily recovered by a 200 mM imidazole elution step. Figure
34 shows an additional desalting step intended to remove imidazole and to be able to
switch to the desired buffer, e.g. PBS, carbonate buffer, saline.

Starting from about 50,000 cm2 of RK13 cells infected with vvHCV11A (or
vvHCV40) for the production of E1 or vvHCV41, vvHCV42, vvHCV43, or vvHCV44 for
production of E2 protein, the procedures described in examples 5.1 to 5.5 allow the
purification of approximately 1.3 mg of E1 protein and 0.6 mg of E2 protein.

It should also be remarked that secreted E2 protein (constituting approximately 30-
40%, 60-70% being in the intracellular form) is characterized by aggregate formation
(contrary to expectations). The same problem thus arises when purifying secreted E2. The
secreted E2 can be purified as disclosed above.

Example 6: ELISA for the detection of anti-E1 or anti-E2 antibodies or for the
detection of E1 or E2 proteins



Maxisorb microwell plates (Nunc, Roskilde, Denmark) were coated with 1 volume
(e.g. 50 µl or 100 µl or 200 µl) per well of a 5 µg/ml solution of Streptavidin (Boehringer
Mannheim) in PBS for 16 hours at 4°C or for 1 hour at 37°C. Alternatively, the wells were
coated with 1 volume of 5 µg/ml of Galanthus nivalis agglutinin (GNA) in 50 mM sodium
carbonate buffer pH 9.6 for 16 hours at 4°C or for 1 hour at 37°C. In the case of coating
with GNA, the plates were washed 2 times with 400 µl of Washing Solution of the Innotest
HCV Ab III kit (Innogenetics, Zwijndrecht, Belgium). Unbound coating surfaces were
blocked with 1.5 to 2 volumes of blocking solution (0.1% casein and 0.1% NaN3 in PBS)
for 1 hour at 37°C or for 16 hours at 4°C. Blocking solution was aspirated. Purified E1 or
E2 was diluted to 100-1000 ng/ml (concentration measured at A = 280 nm), or column
fractions to be screened for E1 or E2 (see example 5), or E1 or E2 in non-purified cell
lysates (example 5.1.), were diluted 20 times in blocking solution, and 1 volume of the E1
or E2 solution was added to each well and incubated for 1 hour at 37°C on the
Streptavidin- or GNA-coated plates. The microwells were washed 3 times with 1 volume
of Washing Solution of the Innotest HCV Ab III kit (Innogenetics, Zwijndrecht, Belgium).
Serum samples were diluted 20 times or monoclonal anti-E1 or anti-E2 antibodies were
diluted to a concentration of 20 ng/ml in Sample Diluent of the Innotest HCV Ab III kit
and 1 volume of the solution was left to react with the E1 or E2 protein for 1 hour at 37°C.
The microwells were washed 5 times with 400 µl of Washing Solution of the Innotest
HCV Ab III kit (Innogenetics, Zwijndrecht, Belgium). The bound antibodies were detected
by incubating each well for 1 hour at 37°C with a goat anti-human or anti-mouse IgG,
peroxidase-conjugated secondary antibody (DAKO, Glostrup, Denmark) diluted 1/80,000
in 1 volume of Conjugate Diluent of the Innotest HCV Ab III kit (Innogenetics,
Zwijndrecht, Belgium), and color development was obtained by addition of substrate of the
Innotest HCV Ab III kit (Innogenetics, Zwijndrecht, Belgium) diluted 100 times in 1
volume of Substrate Solution of the Innotest HCV Ab III kit (Innogenetics, Zwijndrecht,
Belgium) for 30 min at 24°C after washing of the plates 3 times with 400 µl of Washing
Solution of the Innotest HCV Ab III kit (Innogenetics, Zwijndrecht, Belgium).
Example 7: Follow up of patient groups with different clinical profiles

7.1. Monitoring of anti-E1 and anti-E2 antibodies



The current hepatitis C virus (HCV) diagnostic assays have been developed for
screening and confirmation of the presence of HCV antibodies. Such assays do not seem to
provide information useful for monitoring of treatment or for prognosis of the outcome of
disease. However, as is the case for hepatitis B, detection and quantification of anti-
envelope antibodies may prove more useful in a clinical setting. To investigate the
possibility of the use of anti-E1 antibody titer and anti-E2 antibody titer as prognostic
markers for outcome of hepatitis C disease, a series of IFN-α treated patients with long-
term sustained response (defined as patients with normal transaminase levels and negative
HCV-RNA test (PCR in the 5' non-coding region) in the blood for a period of at least 1
year after treatment) was compared with patients showing no response or showing
biochemical response with relapse at the end of treatment.

A group of 8 IFN-α treated patients with long-term sustained response (LTR,
follow up 1 to 3.5 years, 3 type 3a and 5 type 1b) was compared with 9 patients showing
non-complete responses to treatment (NR, follow up 1 to 4 years, 6 type 1b and 3 type 3a).
Type 1b (vvHCV-39, see example 2.5.) and 3a E1 (vvHCV-62, see example 2.5.) proteins
were expressed by the vaccinia virus system (see examples 3 and 4) and purified to
homogeneity (example 5). The samples derived from patients infected with a type 1b
hepatitis C virus were tested for reactivity with purified type 1b E1 protein, while samples
of a type 3a infection were tested for reactivity of anti-type 3a E1 antibodies in an ELISA
as described in example 6. The genotypes of hepatitis C viruses infecting the different
patients were determined by means of the Inno-LiPA genotyping assay (Innogenetics,
Zwijndrecht, Belgium). Figure 5 shows the anti-E1 signal-to-noise ratios of these patients
followed during the course of interferon treatment and during the follow-up period after
treatment. LTR cases consistently showed rapidly declining anti-E1 levels (with complete
negativation in 3 cases), while anti-E1 levels of NR cases remained approximately
constant. Some of the obtained anti-E1 data are shown in Table 2 as average S/N ratios ±
SD (mean anti-E1 titer). The anti-E1 titer could be deduced from the signal to noise ratio as
shown in Figures 5, 6, 7, and 8.

Already at the end of treatment, marked differences could be observed between the
2 groups. Anti-E1 antibody titers had decreased 6.9 times in LTR but only 1.5 times in NR.
At the end of follow up, the anti-E1 titers had declined by a factor of 22.5 in the patients
with sustained response and even slightly increased in NR. Therefore, based on these data,
decrease of anti-E1 antibody levels during monitoring of IFN-α therapy correlates with
long-term, sustained response to treatment. The anti-E1 assay may be very useful for
prognosis of long-term response to IFN treatment, or to treatment of the hepatitis C disease
in general.
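
The fold changes quoted above compare titres between two time points. One plausible way to compute such a figure, taken here as the ratio of the group mean at the start to the group mean at the later time point, is sketched below; the text does not state which averaging was used, so this is an assumption. The S/N values are invented placeholders chosen only to show the arithmetic and are not data from Table 2.

```python
from statistics import mean


def fold_decrease(start_values: list[float], end_values: list[float]) -> float:
    """Ratio of the mean starting value to the mean end value (>1 means a decline)."""
    return mean(start_values) / mean(end_values)


# Placeholder signal-to-noise ratios for three hypothetical patients (not real data).
start_of_treatment = [12.0, 8.0, 10.0]
end_of_treatment = [2.0, 1.5, 1.5]

print(f"fold decrease: {fold_decrease(start_of_treatment, end_of_treatment):.1f}")
```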

This finding was not expected. On the contrary, the inventors had expected the anti-
E1 antibody levels to increase during the course of IFN treatment in patients with long
term response. In hepatitis B, for instance, the virus is cleared as a consequence of the
seroconversion for anti-HBsAg antibodies. Also in many other virus infections, the virus is
eliminated when anti-envelope antibodies are raised. However, in the experiments of the
present invention, anti-E1 antibodies clearly decreased in patients with a long-term
response to treatment, while the antibody level remained approximately at the same level
in non-responding patients. Although the outcome of these experiments was not expected,
this non-obvious finding may be very important and useful for clinical diagnosis of HCV
infections. As shown in Figures 9, 10, 11, and 12, anti-E2 levels behaved very differently
in the same patients studied and no obvious decline in titers was observed as for anti-E1
antibodies. Figure 35 gives a complete overview of the pilot study.

As can be deduced from Table 2, the anti-E1 titers were on average at least 2 times
higher at the start of treatment in long term responders compared with incomplete
responders to treatment. Therefore, measuring the titer of anti-E1 antibodies at the start of
treatment, or monitoring the patient during the course of infection and measuring the
anti-E1 titer, may become a useful marker for clinical diagnosis of hepatitis C. Furthermore,
the use of more defined regions of the E1 or E2 proteins may become desirable, as shown in
example 7.3.






7.2. Analysis of E1 and E2 antibodies in a larger patient cohort

The pilot study led the inventors to conclude that, in case infection was
completely cleared, antibodies to the HCV envelope proteins changed more rapidly than
antibodies to the more conventionally studied HCV antigens, with E1 antibodies changing
most vigorously. We therefore included more type 1b and 3a-infected LTR and further
supplemented the cohort with a matched series of NR, such that both groups included 14
patients each. Some partial responders (PR) and responders with relapse (RR) were also
analyzed.

Figure 36 depicts average E1 antibody (E1Ab) and E2 antibody (E2Ab) levels in
the LTR and NR groups and Tables 4 and 5 show the statistical analyses. In this larger
cohort, higher E1 antibody levels before IFN-α therapy were associated with LTR (P <
0.03). Since much higher E1 antibody levels were observed in type 3a-infected patients
compared with type 1b-infected patients (Figure 37), the genotype was taken into account
(Table 4). Within the type 1b-infected group, LTR also had higher E1 antibody levels than
NR at the initiation of treatment [P < 0.05]; the limited number of type 3a-infected NR did
not allow statistical analysis.

Of the antibody levels monitored in LTR during the 1.5-year follow up period, only E1
antibodies cleared rapidly compared with levels measured at initiation of treatment [P =
0.0058, end of therapy; P = 0.0047 and P = 0.0051 at 6 and 12 months after therapy,
respectively]. This clearance remained significant within type 1- or type 3-infected LTR
(average P values < 0.05). These data confirmed the initial finding that E1Ab levels
decrease rapidly in the early phase of resolution. This feature seems to be independent of
viral genotype. In NR, PR, or RR, no changes in any of the antibodies measured were
observed throughout the follow up period. In patients who responded favourably to
treatment with normalization of ALT levels and HCV-RNA negative during treatment,
there was a marked difference between sustained responders (LTR) and responders with a
relapse (RR). In contrast to LTR, RR did not show any decreasing E1 antibody levels,
indicating the presence of occult HCV infection that could be demonstrated neither by
PCR or other classical techniques for detection of HCV-RNA, nor by raised ALT levels.
The minute quantities of viral RNA, still present in the RR group during treatment, seemed
to be capable of anti-E1 B cell stimulation. Anti-E1 monitoring may therefore not only be
able to discriminate LTR from NR, but also from RR.

7.3. Monitoring of antibodies of defined regions of the E1 protein

Although the molecular biological approach of identifying HCV antigens resulted
in an unprecedented breakthrough in the development of viral diagnostics, the method of
immune screening of λgt11 libraries predominantly yielded linear epitopes dispersed
throughout the core and non-structural regions, and analysis of the envelope regions had to
await cloning and expression of the E1/E2 region in mammalian cells. This situation
contrasts sharply with many other viral infections, for which epitopes of the envelope
regions had already been mapped long before the deciphering of the genomic structure.
Such epitopes and corresponding antibodies often had neutralizing activity useful for
vaccine development and/or allowed the development of diagnostic assays with clinical or
prognostic significance (e.g. antibodies to hepatitis B surface antigen).
As no HCV vaccines or tests allowing clinical diagnosis and prognosis of hepatitis C
disease are available today, the characterization of viral envelope regions exposed to
immune surveillance may significantly contribute to new directions in HCV diagnosis and
prophylaxis.

Several 20-mer peptides (Table 3) that overlapped each other by 8 amino acids
were synthesized according to a previously described method (EP-A-0 489 968) based on
the HC-J1 sequence (Okamoto et al., 1990). None of these, except peptide env35 (also
referred to as E1-35), was able to detect antibodies in sera of approximately 200 HCV
cases. Only 2 sera reacted slightly with the env35 peptide. However, by means of the
anti-E1 ELISA as described in example 6, it was possible to discover additional epitopes as
follows: the anti-E1 ELISA as described in example 6 was modified by mixing 50 µg/ml
of E1 peptide with the 1/20 diluted human serum in sample diluent. Figure 13 shows the
results of reactivity of human sera to the recombinant E1 protein (expressed from
vvHCV-40) in the presence of single E1 peptides or of a mixture of E1 peptides. While only
2% of the sera could be detected by means of E1 peptides coated on strips in a Line
Immunoassay format, over half of the sera contained anti-E1 antibodies which could be
competed by means of the same peptides, when tested on the recombinant E1 protein.
Some of the murine monoclonal antibodies obtained from Balb/C mice after injection with
purified E1 protein were subsequently competed for reactivity to E1 with the single
peptides (Figure 14). Clearly, the region of env53 contained the predominant epitope, as
the addition of env53 could substantially compete reactivity of several sera with E1, and
antibodies to the env31 region were also detected. This finding was surprising, since the
env53 and env31 peptides had not shown any reactivity when coated directly to the solid
phase.
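
The peptide design described above (20-mers overlapping by 8 amino acids, i.e. a step of 12 residues) can be sketched as a simple tiling routine. This is an illustrative sketch only: the function name and parameters are assumptions, and the input stretch is the region 181 to 212 obtained by joining peptides E1-31 and E1-33 of Table 3 over their 8-residue overlap.

```python
def tile_peptides(sequence, first_position, length=20, overlap=8):
    """Cut a protein sequence into overlapping peptides.

    Returns (start, end, peptide) tuples, with start and end given as
    positions in the polyprotein numbering.
    """
    step = length - overlap  # 12 residues for 20-mers overlapping by 8
    peptides = []
    for offset in range(0, len(sequence) - length + 1, step):
        start = first_position + offset
        peptides.append((start, start + length - 1, sequence[offset:offset + length]))
    return peptides


# Region 181-212, joined from peptides E1-31 and E1-33 of Table 3.
REGION_181_212 = "LLSCLTVPASAY" + "QVRNSTGL" + "YHVTNDCPNSSI"

for start, end, peptide in tile_peptides(REGION_181_212, first_position=181):
    print(f"{start}-{end} {peptide}")
```

Applied to this stretch, the routine returns the two peptides at positions 181-200 and 193-212, matching the first two rows of Table 3.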

Therefore, peptides were synthesized using technology described previously by the
applicant (in WO 93/18054). The following peptides were synthesized:

peptide env35A-biotin

NH2-SNSSEAADMIMHTPGCV-GK-biotin (SEQ ID NO 51)

spanning amino acids 208 to 227 of the HCV polyprotein in the E1 region

peptide biotin-env53 ('epitope A')

biotin-GG-ITGHRMAWDMMMNWSPTTAL-COOH (SEQ ID NO 52)

spanning amino acids 313 to 332 of the HCV polyprotein in the E1 region

peptide 1bE1 ('epitope B')

H2N-YEVRNVSGIYHVTNDCSNSSIVYEAADMIMHTPGCGK-biotin (SEQ ID NO 53)

spanning amino acids 192 to 228 of the HCV polyprotein in the E1 region
The reactivities of these peptides were compared with those of peptides E1a-BB
(biotin-GG-TPTVATRDGKLPATQLRRHIDLL, SEQ ID NO 54) and E1b-BB
(biotin-GG-TPTLAARDASVPTTTIRRHVDLL, SEQ ID NO 55), which are derived from the same
region of sequences of genotype 1a and 1b respectively and which have been described at
the IXth international virology meeting in Glasgow, 1993 ('epitope C'). Reactivity of a
panel of HCV sera was tested on epitopes A, B and C, and epitope B was also compared
with env35A (of 47 HCV-positive sera, 8 were positive on epitope B and none reacted with
env35A). Reactivity towards epitopes A, B, and C was tested directly on the biotinylated
peptides (50 µg/ml) bound to streptavidin-coated plates as described in example 6. Clearly,
epitopes A and B were most reactive while epitopes C and env35A-biotin were much less
reactive. The same series of patients that had been monitored for their reactivity towards
the complete E1 protein (example 7.1.) was tested for reactivity towards epitopes A, B, and
C. Little reactivity was seen to epitope C, while, as shown in Figures 15, 16, 17, and 18,
epitopes A and B reacted with the majority of sera. However, antibodies to the most
reactive epitope (epitope A) did not seem to predict remission of disease, while the
anti-1bE1 antibodies (epitope B) were present almost exclusively in long term responders
at the start of IFN treatment. Therefore, anti-1bE1 (epitope B) antibodies and anti-env53
(epitope A) antibodies could be shown to be useful markers for prognosis of hepatitis C
disease. The env53 epitope may be advantageously used for the detection of cross-reactive
antibodies (antibodies that cross-react between major genotypes), and antibodies to the
env53 region may be very useful for universal E1 antigen detection in serum or liver tissue.
Monoclonal antibodies that recognized the env53 region were reacted with a random
epitope library. In 4 clones that reacted upon immunoscreening with the monoclonal
antibody 5E1A10, the sequence -GWD- was present. Because of its analogy with the
universal HCV sequence present in all HCV variants in the env53 region, the sequence
AWD is thought to contain the essential sequence of the env53 cross-reactive murine
epitope. The env31 peptide clearly also contains a variable region, which may contain an
epitope in the amino terminal sequence -YQVRNSTGL- (SEQ ID NO 93) and may be useful
for diagnosis. Env31 or E1-31, as shown in Table 3, is a part of the peptide 1bE1. Peptides
E1-33 and E1-51 also reacted to some extent with the murine antibodies, and peptide E1-55
(containing the variable region 6 (V6); spanning amino acid positions 329-336) also
reacted with some of the patient sera.

Anti-E2 antibodies clearly followed a different pattern than the anti-E1 antibodies,
especially in patients with a long-term response to treatment. Therefore, it is clear that the
decrease in anti-envelope antibodies could not be measured as efficiently with an assay
employing a recombinant E1/E2 protein as with an assay employing a single E1 or E2
protein. The anti-E2 response would clearly blur the anti-E1 response in an assay measuring
both kinds of antibodies at the same time. Therefore, the ability to test anti-envelope
antibodies to the single E1 and E2 proteins was shown to be useful.

7.4. Mapping of anti-E2 antibodies

Of the 24 anti-E2 Mabs, only three could be competed for reactivity to recombinant
E2 by peptides: two reacted with the HVRI region (peptides E2-67 and E2-69,
designated as epitope A) and one recognized an epitope competed by peptide E2-13B
(epitope C). The majority of murine antibodies recognized conformational anti-E2
epitopes (Figure 19). A human response to HVRI (epitope A), and to a lesser extent HVRII
(epitope B), a third linear epitope region (competed by peptides E2-23, E2-25 or E2-27,
designated epitope E) and a fourth linear epitope region (competed by peptide E2-17B,
epitope D) could also frequently be observed, but the majority of sera reacted with
conformational epitopes (Figure 20). These conformational epitopes could be grouped
according to their relative positions as follows. The IgG antibodies in the supernatant of
hybridomas 15C8C1, 12D11F1, 9G3E6, 8G10D1H9, 10D3C4, 4H6B2, 17F2C2, 5H6A7, and
15B7A2, which recognize conformational epitopes, were purified by means of protein A
affinity chromatography, and 1 mg/ml of the resulting IgGs was biotinylated in borate
buffer in the presence of biotin. Biotinylated antibodies were separated from free biotin by
means of gel filtration chromatography. Pooled biotinylated antibody fractions were diluted
100 to 10,000 times. E2 protein bound to the solid phase was detected by the biotinylated
IgG in the presence of 100 times the amount of non-biotinylated competing antibody and
subsequently detected by alkaline phosphatase labeled streptavidin.

Percentages of competition are given in Table 6. Based on these results, 4
conformational anti-E2 epitope regions (epitopes F, G, H and I) could be delineated
(Figure 38). Alternatively, these Mabs may recognize mutant linear epitopes not
represented by the peptides used in this study. Mabs 4H6B2 and 10D3C4 competed
reactivity of 16A6E7, but unlike 16A6E7, they did not recognize peptide E2-13B. These
Mabs may recognize variants of the same linear epitope (epitope C) or recognize a
conformational epitope which is sterically hindered or changes conformation after binding
of 16A6E7 to the E2-13B region (epitope H).

Example 8: E1 glycosylation mutants

8.1. Introduction

The E1 protein encoded by vvHCV10A and the E2 protein encoded by vvHCV41
to 44, expressed from mammalian cells, contain 6 and 11 carbohydrate moieties,
respectively. This could be shown by incubating the lysate of vvHCV10A-infected or
vvHCV44-infected RK13 cells with decreasing concentrations of glycosidases (PNGase F
or Endoglycosidase H (Boehringer Mannheim Biochemica), according to the
manufacturer's instructions), such that the proteins in the lysate (including E1) are partially
deglycosylated (Fig. 39 and 40, respectively).

Mutants devoid of some of their glycosylation sites could allow the selection of
envelope proteins with improved immunological reactivity. For HIV for example, gp120
proteins lacking certain selected sugar-addition motifs have been found to be particularly
useful for diagnostic or vaccine purposes. The addition of a new oligosaccharide side chain
in the hemagglutinin protein of an escape mutant of the A/Hong Kong/3/68 (H3N2)
influenza virus prevents reactivity with a neutralizing monoclonal antibody (Skehel et al.,
1984). When novel glycosylation sites were introduced into the influenza hemagglutinin
protein by site-specific mutagenesis, dramatic antigenic changes were observed, suggesting
that the carbohydrates serve as a modulator of antigenicity (Gallagher et al., 1988). In
another analysis, the 8 carbohydrate-addition motifs of the surface protein gp70 of the
Friend Murine Leukemia Virus were deleted. Although seven of the mutations did not
affect virus infectivity, mutation of the fourth glycosylation signal with respect to the
amino terminus resulted in a non-infectious phenotype (Kayman et al., 1991). Furthermore,
it is known in the art that addition of N-linked carbohydrate chains is important for
stabilization of folding intermediates and thus for efficient folding, prevention of
malfolding and degradation in the endoplasmic reticulum, oligomerization, biological
activity, and transport of glycoproteins (see reviews by Rose et al., 1988; Doms et al.,
1993; Helenius, 1994).

After alignment of the different envelope protein sequences of HCV genotypes, it
may be inferred that not all 6 glycosylation sites on the HCV subtype 1b E1 protein are
required for proper folding and reactivity, since some are absent in certain (sub)types. The
fourth carbohydrate motif (on Asn251), present in types 1b, 6a, 7, 8, and 9, is absent in all
other types known today. This sugar-addition motif may be mutated to yield a type 1b E1
protein with improved reactivity. Also the type 2b sequences show an extra glycosylation
site in the V5 region (on Asn299). The isolate S83, belonging to genotype 2c, even lacks
the first carbohydrate motif in the V1 region (on Asn), while it is present on all other
isolates (Stuyver et al., 1994). However, even among the completely conserved
sugar-addition motifs, the presence of the carbohydrate may not be required for folding, but
may have a role in evasion of immune surveillance. Therefore, identification of the
carbohydrate addition motifs which are not required for proper folding (and reactivity) is
not obvious, and each mutant has to be analyzed and tested for reactivity. Mutagenesis of a
glycosylation motif (NXS or NXT sequences) can be achieved by mutating the
codons for N, S, or T in such a way that these codons encode amino acids different from N
in the case of N, and/or amino acids different from S or T in the case of S or T.
Alternatively, the X position may be mutated into P, since it is known that NPS or NPT
are not frequently modified with carbohydrates. After establishing which
carbohydrate-addition motifs are required for folding and/or reactivity and which are not,
combinations of such mutations may be made.
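
The motif rule described above (NXS or NXT, with NPS and NPT rarely modified) can be illustrated with a short scan for candidate sugar-addition sites. The sketch below is illustrative only: the function names are assumptions, motifs with proline at the X position are simply excluded as a simplification, and the input is peptide E1-33 from Table 3 (polyprotein positions 193 to 212).

```python
import re

# N-X-S/T motif with X != P (simplification: NPS/NPT are treated as never
# modified). The lookahead lets overlapping motifs be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein, first_position=1):
    """Return polyprotein positions of Asn residues inside N-X-S/T motifs."""
    return [first_position + match.start() for match in SEQUON.finditer(protein)]


def mutate_asn(protein, position, first_position=1, replacement="Q"):
    """Replace the Asn at the given position, by default with Gln."""
    index = position - first_position
    if protein[index] != "N":
        raise ValueError(f"residue {position} is {protein[index]}, not Asn")
    return protein[:index] + replacement + protein[index + 1:]


# Peptide E1-33 from Table 3 (positions 193-212 of the HCV polyprotein).
E1_33 = "QVRNSTGLYHVTNDCPNSSI"

sites = find_sequons(E1_33, first_position=193)
print(sites)

mutant = mutate_asn(E1_33, sites[0], first_position=193)
print(mutant, find_sequons(mutant, first_position=193))
```

Whether a predicted motif is actually glycosylated, and whether its removal affects folding or reactivity, still has to be tested experimentally for each mutant, as stated in the text.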

8.2. Mutagenesis of the E1 protein

All mutations were performed on the E1 sequence of clone HCCl10A (SEQ ID
NO. 5). The first round of PCR was performed using sense primer 'GPT' (see Table 7),
targeting the GPT sequence located upstream of the vaccinia 11K late promoter, and an
antisense primer (designated GLY#, with # representing the number of the glycosylation
site, see Fig. 41) containing the desired base change to obtain the mutagenesis. The six
GLY# primers (each specific for a given glycosylation site) were designed according to the
following criteria:
- Modification of the codon encoding the N-glycosylated Asn (AAC or AAT) to a Gln
codon (CAA or CAG). Glutamine was chosen because it is very similar to asparagine (both
amino acids are neutral and contain non-polar residues; glutamine has a longer side chain
(one more -CH2- group)).
- The introduction of silent mutations in one or several of the codons downstream of the
glycosylation site, in order to create a new unique or rare restriction enzyme site (e.g. a
second SmaI site for E1Gly5). Without modifying the amino acid sequence, this
mutation will provide a way to distinguish the mutated sequences from the original E1
sequence (pvHCV-10A) or from each other (Figure 41). This additional restriction site
may also be useful for the construction of new hybrid (double, triple, etc.) glycosylation
mutants.
- 18 nucleotides extend 5' of the first mismatched nucleotide and 12 to 16 nucleotides
extend to the 3' end.

Table 7 depicts the sequences of the six GLY# primers overlapping
the sequence of N-linked glycosylation sites.
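
The first design criterion, replacing an Asn codon (AAC or AAT) by a Gln codon (CAA or CAG), can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, and the 15-nucleotide input is a made-up toy sequence encoding V-R-N-S-T, not a fragment of the actual E1 clone or of the GLY# primers in Table 7.

```python
ASN_CODONS = {"AAC", "AAT"}
GLN_CODONS = {"CAA", "CAG"}


def asn_to_gln(coding_sequence, codon_index, gln_codon="CAA"):
    """Replace the Asn codon at codon_index (0-based) with a Gln codon."""
    start = 3 * codon_index
    codon = coding_sequence[start:start + 3]
    if codon not in ASN_CODONS:
        raise ValueError(f"codon {codon!r} does not encode Asn")
    if gln_codon not in GLN_CODONS:
        raise ValueError(f"codon {gln_codon!r} does not encode Gln")
    return coding_sequence[:start] + gln_codon + coding_sequence[start + 3:]


def count_mismatches(original, mutated):
    """Count the nucleotide positions that differ between two sequences."""
    return sum(a != b for a, b in zip(original, mutated))


# Hypothetical toy sequence: GTG CGC AAC TCC ACG encodes V-R-N-S-T.
TOY_SEQUENCE = "GTGCGCAACTCCACG"

mutated = asn_to_gln(TOY_SEQUENCE, codon_index=2, gln_codon="CAG")
print(mutated, count_mismatches(TOY_SEQUENCE, mutated))
```

Any of the four possible Asn-to-Gln codon changes requires two nucleotide substitutions, which is why the primer needs sufficiently long matching arms on both sides of the mismatch.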

For site-directed mutagenesis, the 'mispriming' or 'overlap extension' method (Horton,
1993) was used. The concept is illustrated in Figures 42 and 43. First, two separate
fragments were amplified from the target gene for each mutated site. The PCR product
obtained from the 5' end (product GLY#) was amplified with the 5' sense GPT primer (see
Table 7) and with the respective 3' antisense GLY# primers. The second fragment (product
OVR#) was amplified with the 3' antisense TKR primer and the respective 5' sense primers
(OVR# primers, see Table 7, Figure 43).

The OVR# primers target part of the GLY# primer sequence. Therefore, the two
groups of PCR products share an overlap region of identical sequence. When these
intermediate products are mixed (GLY-1 with OVR-1, GLY-2 with OVR-2, etc.), melted
at high temperature, and reannealed, the top sense strand of product GLY# can anneal to
the antisense strand of product OVR# (and vice versa) in such a way that the two strands
act as primers for one another (see Fig. 42.B.). Extension of the annealed overlap by Taq
polymerase during two PCR cycles created the full-length mutant molecule E1Gly#, which
carries the mutation destroying the glycosylation site number #. Sufficient quantities of the
E1Gly# products for cloning were generated in a third PCR by means of a common set of
two internal nested primers. These two new primers overlap the 3' end
of the vaccinia 11K promoter (sense GPT-2 primer) and the 5' end of the vaccinia
thymidine kinase locus (antisense TKR-2 primer, see Table 7), respectively. All PCRs were
performed under the conditions described in Stuyver et al. (1993).

Each of these PCR products was cloned by EcoRI/BamHI cleavage into the
EcoRI/BamHI-cut vaccinia vector containing the original E1 sequence (pvHCV-10A).

The selected clones were analyzed for length of insert by EcoRI/BamHI cleavage
and for the presence of each new restriction site. The sequences overlapping the mutated
sites were confirmed by double-stranded sequencing.
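
The principle of the overlap extension step described above, in which two intermediate PCR products sharing an identical overlap region are joined into one full-length mutant molecule, can be sketched on sense-strand sequences. This is an illustration only: the function name and the `min_overlap` parameter are assumptions, the fragments are short hypothetical toy sequences, and the sketch models only the sequence logic, not the annealing or polymerase extension itself.

```python
def overlap_extend(upstream, downstream, min_overlap=12):
    """Join two sense-strand fragments over their shared overlap region.

    The longest suffix of `upstream` that equals a prefix of `downstream`
    is used as the overlap, mimicking how the annealed strands prime
    one another.
    """
    longest = min(len(upstream), len(downstream))
    for size in range(longest, min_overlap - 1, -1):
        if upstream[-size:] == downstream[:size]:
            return upstream + downstream[size:]
    raise ValueError("fragments share no overlap of the required length")


# Hypothetical toy fragments; the shared region carries a mutated CAG codon.
OVERLAP = "GTGCGCCAGTCCACG"
PRODUCT_GLY = "ATGGCTTACGAG" + OVERLAP   # 5' fragment (product GLY#)
PRODUCT_OVR = OVERLAP + "GGGCTGTACCAT"   # 3' fragment (product OVR#)

full_length = overlap_extend(PRODUCT_GLY, PRODUCT_OVR)
print(full_length, len(full_length))
```

In this toy case the joined molecule contains the shared region exactly once, flanked by the unique parts of both fragments.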

8.3. Analysis of E1 glycosylation mutants

Starting from the 6 plasmids containing the mutant E1 sequences as described in
example 8.2, recombinant vaccinia viruses were generated by recombination with wt
vaccinia virus as described in example 2.5. Briefly, 175 cm² flasks of subconfluent RK13
cells were infected with the 6 recombinant vaccinia viruses carrying the mutant E1
sequences, as well as with the vvHCV-10A (carrying the non-mutated E1 sequence) and wt
vaccinia viruses. Cells were lysed after 24 hours of infection and analyzed on western blot
as described in example 4 (see Figure 44A). All mutants showed a faster mobility
(corresponding to a smaller molecular weight of approximately 2 to 3 kDa) on SDS-PAGE
than the original E1 protein, confirming that one carbohydrate moiety was not added.
Recombinant viruses were also analyzed by PCR and restriction enzyme analysis to
confirm the identity of the different mutants. Figure 44B shows that all mutants (as shown
in Figure 41) contained the expected additional restriction sites. Another part of the cell
lysate was used to test the reactivity of the different mutants by ELISA. The lysates were
diluted 20 times and added to microwell plates coated with the lectin GNA as described in
example 6. Captured (mutant) E1 glycoproteins were left to react with 20-times diluted
sera of 24 HCV-infected patients as described in example 6. Signal to noise (S/N) values
(OD of GLY#/OD of wt) for the six mutants and E1 are shown in Table 8. The table also
shows the ratios between S/N values of GLY# and E1 proteins. It should be understood
that the approach of using cell lysates of the different mutants for comparison of reactivity
with patient sera may result in observations that are the consequence of different
expression levels rather than reactivity levels. Such difficulties can be overcome by
purification of the different mutants as described in example 5, and by testing identical
quantities of all the different E1 proteins. However, the results shown in Table 8 already
indicate that removal of the 1st (GLY1), 3rd (GLY3), and 6th (GLY6) glycosylation motifs
reduces reactivity of some sera, while removal of the 2nd and 5th site does not. Removal of
GLY4 seems to improve the reactivity of certain sera. These data indicate that different
patients react differently to the glycosylation mutants of the present invention. Thus, such
mutant E1 proteins may be useful for the diagnosis (screening, confirmation, prognosis,
etc.) and prevention of HCV disease.
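
The S/N values and the GLY#/E1 ratios described above for Table 8 can be computed as in the following sketch. The function names are assumptions and the optical density (OD) readings are hypothetical illustration values, not data from Table 8.

```python
def signal_to_noise(od_antigen, od_wt_lysate):
    """S/N value: OD on captured (mutant) E1 divided by OD on wt lysate."""
    if od_wt_lysate <= 0:
        raise ValueError("OD of the wt lysate must be positive")
    return od_antigen / od_wt_lysate


def relative_reactivity(od_mutant, od_e1, od_wt_lysate):
    """Ratio between the S/N value of a GLY# mutant and that of E1."""
    return signal_to_noise(od_mutant, od_wt_lysate) / signal_to_noise(od_e1, od_wt_lysate)


# Hypothetical OD readings for one serum (illustration only).
OD_WT = 0.125
OD_E1 = 1.5
OD_GLY4 = 1.875

print(signal_to_noise(OD_E1, OD_WT))
print(signal_to_noise(OD_GLY4, OD_WT))
print(relative_reactivity(OD_GLY4, OD_E1, OD_WT))
```

A ratio above 1 would indicate a serum reacting more strongly with the mutant than with the non-mutated E1 protein, subject to the caveat about expression levels noted in the text.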

Example 9: Expression of HCV E2 protein in glycosylation-deficient yeasts

The E2 sequence corresponding to clone HCCl41 was provided with the α-mating
factor pre/pro signal sequence and inserted in a yeast expression vector; S. cerevisiae cells
transformed with this construct secreted E2 protein into the growth medium. It was
observed that most glycosylation sites were modified with high-mannose type
glycosylations upon expression of such a construct in S. cerevisiae strains (Figure 45). This
resulted in too high a level of heterogeneity and in shielding of reactivity, which is not
desirable for either vaccine or diagnostic purposes. To overcome this problem, S.
cerevisiae mutants with modified glycosylation pathways were generated by means of
selection of vanadate-resistant clones. Such clones were analyzed for modified
glycosylation pathways by analysis of the molecular weight and heterogeneity of the
glycoprotein invertase. This allowed us to identify different glycosylation-deficient S.
cerevisiae mutants. The E2 protein was subsequently expressed in some of the selected
mutants and left to react with a monoclonal antibody as described in example 7, on western
blot as described in example 4 (Figure 46).

Example 10. General utility

The present results show that not only a good expression system but also a good
purification protocol is required to reach a high reactivity of the HCV envelope proteins
with human patient sera. This can be obtained using the HCV envelope protein
expression system of the present invention, which guarantees the conservation of the
natural folding of the protein, and/or the purification protocols of the present invention,
which guarantee the elimination of contaminating proteins and preserve the conformation,
and thus the reactivity, of the HCV envelope proteins. The
amounts of purified HCV envelope protein needed for diagnostic screening assays are in
the range of grams per year. For vaccine purposes, even higher amounts of envelope
protein would be needed. Therefore, the vaccinia virus system may be used for selecting
the best expression constructs and for limited upscaling, while large-scale expression and
purification of single or specific oligomeric envelope proteins containing high-mannose
carbohydrates may be achieved when expressed from several yeast strains. In the case of
hepatitis B for example, manufacturing of HBsAg from mammalian cells was much more
costly compared with yeast-derived hepatitis B vaccines.

The purification method disclosed in the present invention may also be used for
'viral envelope proteins' in general. Examples are those derived from Flaviviruses, the
newly discovered GB-A, GB-B and GB-C Hepatitis viruses, Pestiviruses (such as Bovine
Viral Diarrhoea Virus (BVDV), Hog Cholera Virus (HCV), Border Disease Virus (BDV)),
but also less related viruses such as Hepatitis B Virus (mainly for the purification of
HBsAg).

The envelope protein purification method of the present invention may be used for
intra- as well as extracellularly expressed proteins in lower or higher eukaryotic cells or in
prokaryotes as set out in the detailed description section.

Table 1: Recombinant vaccinia plasmids and viruses

Plasmid name   Name            cDNA subclone construction    Length (nt/aa)   Vector used for insertion
pvHCV-13A      E1s             EcoR I - Hind III             472/157          pgptATA-18
pvHCV-12A      E1s             EcoR I - Hind III             472/158          pgptATA-18
pvHCV-9A       E1              EcoR I - Hind III             631/211          pgptATA-18
pvHCV-11A      E1s             EcoR I - Hind III             625/207          pgptATA-18
pvHCV-17A      E1s             EcoR I - Hind III             625/208          pgptATA-18
pvHCV-10A      E1              EcoR I - Hind III             783/262
pvHCV-18A      COREs           Acc I (Kl) - EcoR I (Kl)      403/130          pgptATA-18
pvHCV-34       CORE            Acc I (Kl) - Fsp I            595/197          pgptATA-18
pvHCV-33       CORE-E1         Acc I (Kl)                    1150/380         pgptATA-18
pvHCV-35       CORE-E1b.his    EcoR I - BamH I (Kl)          1032/352         pMS-66
pvHCV-36       CORE-E1n.his    EcoR I - Nco I (Kl)           1106/376         pMS-66
pvHCV-37       E1A             Xma I - BamH I                711/239          pvHCV-10A
pvHCV-38       E1As            EcoR I - BstE II              553/183          pvHCV-11A
pvHCV-39       E1Ab            EcoR I - BamH I               960/313          pgsATA-18
pvHCV-40       E1Ab.his        EcoR I - BamH I (Kl)          960/323          pMS-66
pvHCV-41       E2bs            BamH I (Kl) - AlwN I (T4)     1005/331         pgsATA-18
pvHCV-42       E2bs.his        BamH I (Kl) - AlwN I (T4)     1005/341         pMS-66
pvHCV-43       E2ns            Nco I (Kl) - AlwN I (T4)      932/314          pgsATA-18
pvHCV-44       E2ns.his        Nco I (Kl) - AlwN I (T4)      932/321          pMS-66
pvHCV-62       E1s (type 3a)   EcoR I - Hind III             625/207          pgsATA-18
pvHCV-63       E1s (type 5)    EcoR I - Hind III             625/207          pgsATA-18
pvHCV-64       E2              BamH I - Hind III             1410/463         pgsATA-18
pvHCV-65       E1-E2           BamH I - Hind III             2072/691         pvHCV-10A
pvHCV-66       CORE-E1-E2      BamH I - Hind III             2427/809         pvHCV-33

nt: nucleotide; aa: amino acid; Kl: Klenow DNA Pol filling; T4: T4 DNA Pol filling


Position: amino acid position in the HCV polyprotein sequence

Table 1 - continued: Recombinant vaccinia plasmids and viruses

               HCV cDNA subclone
Plasmid name   Name        Construction       Length (nt/aa)   Vector used for insertion
pvHCV-81       E1*-GLY1    EcoR I - BamH I    783/262          pvHCV-10A
pvHCV-82       E1*-GLY2    EcoR I - BamH I    783/262          pvHCV-10A
pvHCV-83       E1*-GLY3    EcoR I - BamH I    783/262          pvHCV-10A
pvHCV-84       E1*-GLY4    EcoR I - BamH I    783/262          pvHCV-10A
pvHCV-85       E1*-GLY5    EcoR I - BamH I    783/262          pvHCV-10A
pvHCV-86       E1*-GLY6    EcoR I - BamH I    783/262          pvHCV-10A

nt: nucleotide; aa: amino acid; Kl: Klenow DNA Pol filling; T4: T4 DNA Pol filling

Position: amino acid position in the HCV polyprotein sequence

Table 2: Summary of anti-E1 tests
S/N ± SD (mean anti-E1 titer)

       Start of treatment      End of treatment        Follow-up
LTR    6.94 ± 2.29 (1:3946)    4.48 ± 2.69 (1:568)     2.99 ± 2.69 (1:175)
NR     5.77 ± 3.77 (1:1607)    5.29 ± 3.99 (1:1060)    6.08 ± 3.73 (1:1978)

LTR: Long-term, sustained response for more than 1 year
NR : No response, response with relapse, or partial response 
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Table 3 

Synthetic peptides for competition studies 



PEPTIDE 




POSITION 


SEQ ID NO 


El -31 


LLSCLTVPASAYQVRNSTGL 


181-200 


56 


El -33 


QVRNSTGLYHVTNDCPNSSI 


193-212 


57 


El-35 


NDCPNS SI V YEAHD AILHTP 


205-224 


58 


E1-35A 


SNSSIVYEAADMIMHTPGCV 


208-227 


59 


El-37 


HD AILHTP GCVPCVREGNVS 


217-236 


60 


El-39 


GVREGNVSRCWVAMTPTVAT 


229-248 


61 


El-41 


AMTPTVATRDGKLPATQLRP. 


241-260 


62 


El -43 


LPATQLRRHIDLLVGSATLC 


253-272 


63 


El -45 


LVGSATLCSALYVGDLCGSV 


265-284 


64 


El -49 


QLFTFSPRRHWTTQGCNCSI 


289-308 


65 


El -51 


TQGCNCS I YPGHITGHRMAW 


301-320 


66 


El-53 


ITGHRMAWDMMMNWSPTAAL 


313-332 


67 


El-55 


NWSPTAALVMAQLLRTPQAI 


325-344 


68 


El-57 


LLRIPQAILDM1AGAHWGVL 


' 337-356 


69 


El-59 


AGAHWGVLAGIAYFSMVGNM 


349-368 


70 


El -63 


WLLLFAGVDAETIVSGGQA 


373-392 


71 
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E2 



E2-67 


SGLVSLFTPGAKONIOL TNT 


1 Q7 4 1 f. 


11 


E2-69 


ONIOI rNTNd^WT^TNT^T AT XT 


/I AO 400 


73 


E2-S3B 


j— 'J. iv^n L/uL/i\ i v_r w vv i v ti ,i i-^Jrl is- 


/1T7 A A £L 


74 


E2-S1B 


AGLIYOHKFN^SnPPF'RT A<? 


/] Q Q /(CO 


75 


E2-1B 




/I C 1 /1 1C\ 

4 j i-4 /(J 


76 


E2-3B 


TDFDOGWfrPT ^ v A Krr^<5 r;pnn 

*■ i-'v^vJ vv vjr lo I /\IN uo vJfi w 


4o_)-4oz 


77 


E2-5B 


ANGSOP^^PPV^*w^^^vT>Pl^p^ , 


/ITC A CiA 

4 / j-4y4 


78 


E2-7B 


vv 1 1 x xx xvr \_-Vjfl V r AKO V vJ± V 


4o /oUo 


79 


E2-9B 


AK 1 SVPrTPVYPFTP<;pv\r\/'nT 

rilY ^J v \_v_jx v x v_«r IijF V V VAJTI 


/I OO ci Q 


o0 


E2-1 IB 


PSPVWGTTDT? A PTY<? Wfi 

i ui_ y v v v_J x X UIVOUA.r 1 I j w 


c i i con 


o J 


E2-13B 


GAPTYSWPrFMDTriVFVT >JMT 

vj/tj. x x O vv VJONL/ J. V 17 V i^lN IN 1 


coo C/l-) 


iiz 


E2-17B 


vjn vvi uv^ i vv iviiNo 1 KJr I rv V v^VJ/-Y 


C/1-7 c</r 
J4/-JDO 


83 


E2-19B 


GFTTCvrr; a pp vptoo a <tkt\tt 

vJi x xv v v^umr V v- i w U/i vj I N i N 1 


CCQ C "70 


84 


E2-21 


TrrOArT>vrMTT HfPTnrup t^t-to 
i ^JVj^vvjiNiN I l^ni^r 1 i_j"w-r lvxvri.r 


C71 con 


85 


E2-23 




JOJ-OU2 


86 




i> KCCjbCrP WITPRCL VD YP YR 


595-614 


87 


E2-27    CLVDYPYRLWHYPCTINYTI  607-626   88
E2-29    PCTINYTIFKIRMYVGGVEH  619-638   89
E2-31    MYVGGVEHRLEAACNWTPGE  631-650   90
E2-33    ACNWTPGERCDLEDRDRSEL  643-662   91
E2-35    EDRDRSELSPLLLTTTQWQV  655-674   92
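
The positions listed in Table 3 suggest that consecutive 20-mer peptides are offset by 12 residues and therefore share 8 residues. The following is a minimal Python sketch of how that overlap could be checked for legible rows of the table. The function names `overlap` and `expected_overlap` and the `PEPTIDES` dictionary are illustrative assumptions and are not part of the specification.

```python
def overlap(left: str, right: str) -> int:
    """Length of the longest suffix of `left` that is also a prefix of `right`."""
    for k in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:k]):
            return k
    return 0


# Peptide name -> (sequence, first position, last position), copied from Table 3.
PEPTIDES = {
    "E1-31": ("LLSCLTVPASAYQVRNSTGL", 181, 200),
    "E1-33": ("QVRNSTGLYHVTNDCPNSSI", 193, 212),
    "E2-27": ("CLVDYPYRLWHYPCTINYTI", 607, 626),
    "E2-29": ("PCTINYTIFKIRMYVGGVEH", 619, 638),
}


def expected_overlap(first, second) -> int:
    """Overlap implied by the listed positions (0 if the ranges do not touch)."""
    return max(0, first[2] - second[1] + 1)


for a, b in [("E1-31", "E1-33"), ("E2-27", "E2-29")]:
    found = overlap(PEPTIDES[a][0], PEPTIDES[b][0])
    print(a, b, found, expected_overlap(PEPTIDES[a], PEPTIDES[b]))
```

A disagreement between the two numbers for a given pair would point to a transcription error in either the sequence or the position column.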
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SEQUENCE LISTING 



(1) GENERAL INFORMATION:

(i) APPLICANT:
(A) NAME: Innogenetics N.V.
(B) STREET: Industriepark Zwijnaarde 7 Bus 4
(C) CITY: Gent
(E) COUNTRY: Belgium
(F) POSTAL CODE (ZIP): 9052
(G) TELEPHONE: 00-32-09.241.07.11
(H) TELEFAX: 00-32-09.241.07.99

(ii) TITLE OF INVENTION: Purified hepatitis C virus envelope proteins for diagnostic and therapeutic use.

(iii) NUMBER OF SEQUENCES: 111

(iv) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.25 (EPO)

(v) CURRENT APPLICATION DATA:
APPLICATION NUMBER: PCT/EP/95/03031

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
GGCATGCAAG CTTAATTAAT T  21
(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 68 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:
CCGGGGAGGC CTGCACGTGA TCGAGGGCAG ACACCATCAC CACCATCACT AATAGTTAAT  60
TAACTGCA  68
(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 642 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..639

(ix) FEATURE:
(A) NAME/KEY: mat_peptide
(B) LOCATION: 1..636

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

ATG CCC GGT TGC TCT TTC TCT ATC TTC CTC TTG GCT TTA CTG TCC TGT  48
Met Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys
1               5                   10                  15

CTG ACC ATT CCA GCT TCC GCT TAT GAG GTG CGC AAC GTG TCC GGG ATG  96
Leu Thr Ile Pro Ala Ser Ala Tyr Glu Val Arg Asn Val Ser Gly Met
            20                  25                  30

TAC CAT GTC ACG AAC GAC TGC TCC AAC TCA AGC ATT GTG TAT GAG GCA  144
Tyr His Val Thr Asn Asp Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala
        35                  40                  45

GCG GAC ATG ATC ATG CAC ACC CCC GGG TGC GTG CCC TGC GTT CGG GAG  192
Ala Asp Met Ile Met His Thr Pro Gly Cys Val Pro Cys Val Arg Glu
    50                  55                  60

AAC AAC TCT TCC CGC TGC TGG GTA GCG CTC ACC CCC ACG CTC GCA GCT  240
Asn Asn Ser Ser Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala
65                  70                  75                  80

AGG AAC GCC AGC GTC CCC ACC ACG ACA ATA CGA CGC CAC GTC GAT TTG  288
Arg Asn Ala Ser Val Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu
                85                  90                  95

CTC GTT GGG GCG GCT GCT CTC TGT TCC GCT ATG TAC GTG GGG GAT CTC  336
Leu Val Gly Ala Ala Ala Leu Cys Ser Ala Met Tyr Val Gly Asp Leu
            100                 105                 110

TGC GGA TCT GTC TTC CTC GTC TCC CAG CTG TTC ACC ATC TCG CCT CGC  384
Cys Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Ile Ser Pro Arg
        115                 120                 125

CGG CAT GAG ACG GTG CAG GAC TGC AAT TGC TCA ATC TAT CCC GGC CAC  432
Arg His Glu Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly His
    130                 135                 140

ATA ACA GGT CAC CGT ATG GCT TGG GAT ATG ATG ATG AAC TGG TCG CCT  480
Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp Ser Pro
145                 150                 155                 160

ACA ACG GCC CTG GTG GTA TCG CAG CTG CTC CGG ATC CCA CAA GCT GTC  528
Thr Thr Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro Gln Ala Val
                165                 170                 175

GTG GAC ATG GTG GCG GGG GCC CAT TGG GGA GTC CTG GCG GGC CTC GCC  576
Val Asp Met Val Ala Gly Ala His Trp Gly Val Leu Ala Gly Leu Ala
            180                 185                 190

TAC TAT TCC ATG GTG GGG AAC TGG GCT AAG GTT TTG ATT GTG ATG CTA  624
Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val Leu Ile Val Met Leu
        195                 200                 205

CTC TTT GCT CTC TAATAG  642
Leu Phe Ala Leu
    210

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 212 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys
1               5                   10                  15

Leu Thr Ile Pro Ala Ser Ala Tyr Glu Val Arg Asn Val Ser Gly Met
            20                  25                  30

Tyr His Val Thr Asn Asp Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala
        35                  40                  45

Ala Asp Met Ile Met His Thr Pro Gly Cys Val Pro Cys Val Arg Glu
    50                  55                  60

Asn Asn Ser Ser Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala
65                  70                  75                  80

Arg Asn Ala Ser Val Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu
                85                  90                  95

Leu Val Gly Ala Ala Ala Leu Cys Ser Ala Met Tyr Val Gly Asp Leu
            100                 105                 110

Cys Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Ile Ser Pro Arg
        115                 120                 125

Arg His Glu Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly His
    130                 135                 140

Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp Ser Pro
145                 150                 155                 160

Thr Thr Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro Gln Ala Val
                165                 170                 175

Val Asp Met Val Ala Gly Ala His Trp Gly Val Leu Ala Gly Leu Ala
            180                 185                 190

Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val Leu Ile Val Met Leu
        195                 200                 205

Leu Phe Ala Leu
    210
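
Each nucleotide listing with a CDS feature is printed together with its translation, so the two rows can be cross-checked against each other. The following is a minimal Python sketch of such a check, assuming the standard genetic code. The names `CODON_TABLE` and `translate` are illustrative and do not come from the specification.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate a coding sequence to one-letter residues, stopping at the first stop codon."""
    seq = "".join(cds.split()).upper()  # drop the spaces printed between codons
    protein = []
    for pos in range(0, len(seq) - len(seq) % 3, 3):
        residue = CODON_TABLE[seq[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# First printed line of SEQ ID NO: 3 (residues 1 to 16).
first_line = "ATG CCC GGT TGC TCT TTC TCT ATC TTC CTC TTG GCT TTA CTG TCC TGT"
print(translate(first_line))
```

The printed string can be compared with the three-letter residues listed under the same codons (Met Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys). A mismatch at any position would indicate a transcription error in one of the two rows.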
(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 795 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..792

(ix) FEATURE:
(A) NAME/KEY: mat_peptide
(B) LOCATION: 1..789

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

ATG TTG GGT AAG GTC ATC GAT ACC CTT ACA TGC GGC TTC GCC GAC CTC  48
Met Leu Gly Lys Val Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu
1               5                   10                  15

GTG GGG TAC ATT CCG CTC GTC GGC GCC CCC CTA GGG GGC GCT GCC AGG  96
Val Gly Tyr Ile Pro Leu Val Gly Ala Pro Leu Gly Gly Ala Ala Arg
            20                  25                  30

GCC CTG GCG CAT GGC GTC CGG GTT CTG GAG GAC GGC GTG AAC TAT GCA  144
Ala Leu Ala His Gly Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala
        35                  40                  45

ACA GGG AAT TTG CCC GGT TGC TCT TTC TCT ATC TTC CTC TTG GCT TTG  192
Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu
    50                  55                  60

CTG TCC TGT CTG ACC GTT CCA GCT TCC GCT TAT GAA GTG CGC AAC GTG  240
Leu Ser Cys Leu Thr Val Pro Ala Ser Ala Tyr Glu Val Arg Asn Val
65                  70                  75                  80

TCC GGG ATG TAC CAT GTC ACG AAC GAC TGC TCC AAC TCA AGC ATT GTG  288
Ser Gly Met Tyr His Val Thr Asn Asp Cys Ser Asn Ser Ser Ile Val
                85                  90                  95

TAT GAG GCA GCG GAC ATG ATC ATG CAC ACC CCC GGG TGC GTG CCC TGC  336
Tyr Glu Ala Ala Asp Met Ile Met His Thr Pro Gly Cys Val Pro Cys
            100                 105                 110

GTT CGG GAG AAC AAC TCT TCC CGC TGC TGG GTA GCG CTC ACC CCC ACG  384
Val Arg Glu Asn Asn Ser Ser Arg Cys Trp Val Ala Leu Thr Pro Thr
        115                 120                 125

CTC GCA GCT AGG AAC GCC AGC GTC CCC ACC ACG ACA ATA CGA CGC CAC  432
Leu Ala Ala Arg Asn Ala Ser Val Pro Thr Thr Thr Ile Arg Arg His
    130                 135                 140

GTC GAT TTG CTC GTT GGG GCG GCT GCT TTC TGT TCC GCT ATG TAC GTG  480
Val Asp Leu Leu Val Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val
145                 150                 155                 160

GGG GAC CTC TGC GGA TCT GTC TTC CTC GTC TCC CAG CTG TTC ACC ATC  528
Gly Asp Leu Cys Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Ile
                165                 170                 175

TCG CCT CGC CGG CAT GAG ACG GTG CAG GAC TGC AAT TGC TCA ATC TAT  576
Ser Pro Arg Arg His Glu Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr
            180                 185                 190

CCC GGC CAC ATA ACG GGT CAC CGT ATG GCT TGG GAT ATG ATG ATG AAC  624
Pro Gly His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn
        195                 200                 205

TGG TCG CCT ACA ACG GCC CTG GTG GTA TCG CAG CTG CTC CGG ATC CCA  672
Trp Ser Pro Thr Thr Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro
    210                 215                 220

CAA GCT GTC GTG GAC ATG GTG GCG GGG GCC CAT TGG GGA GTC CTG GCG  720
Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly Val Leu Ala
225                 230                 235                 240

GGT CTC GCC TAC TAT TCC ATG GTG GGG AAC TGG GCT AAG GTT TTG ATT  768
Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val Leu Ile
                245                 250                 255

GTG ATG CTA CTC TTT GCT CCC TAATAG  795
Val Met Leu Leu Phe Ala Pro
            260

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 263 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Met Leu Gly Lys Val Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu
1               5                   10                  15

Val Gly Tyr Ile Pro Leu Val Gly Ala Pro Leu Gly Gly Ala Ala Arg
            20                  25                  30

Ala Leu Ala His Gly Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala
        35                  40                  45

Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu
    50                  55                  60

Leu Ser Cys Leu Thr Val Pro Ala Ser Ala Tyr Glu Val Arg Asn Val
65                  70                  75                  80

Ser Gly Met Tyr His Val Thr Asn Asp Cys Ser Asn Ser Ser Ile Val
                85                  90                  95

Tyr Glu Ala Ala Asp Met Ile Met His Thr Pro Gly Cys Val Pro Cys
            100                 105                 110

Val Arg Glu Asn Asn Ser Ser Arg Cys Trp Val Ala Leu Thr Pro Thr
        115                 120                 125

Leu Ala Ala Arg Asn Ala Ser Val Pro Thr Thr Thr Ile Arg Arg His
    130                 135                 140

Val Asp Leu Leu Val Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val
145                 150                 155                 160

Gly Asp Leu Cys Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Ile
                165                 170                 175

Ser Pro Arg Arg His Glu Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr
            180                 185                 190

Pro Gly His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn
        195                 200                 205

Trp Ser Pro Thr Thr Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro
    210                 215                 220

Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly Val Leu Ala
225                 230                 235                 240

Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val Leu Ile
                245                 250                 255

Val Met Leu Leu Phe Ala Pro
            260

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 633 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..630

(ix) FEATURE:
(A) NAME/KEY: mat_peptide
(B) LOCATION: 1..627

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

ATG TTG GGT AAG GTC ATC GAT ACC CTT ACG TGC GGC TTC GCC GAC CTC 4 8 

Met Leu Gly Lys Val lie Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu 
15 10 15 

ATG GGG TAC ATT CCG CTC GTC GGC GCC CCC CTA GGG GGT GCT GCC AGA 96 
Met Gly Tyr lie Pro Leu Val Gly A>a Pro Leu Gly Gly Ala Ala Arg 
2 0 25 30 

GCC CTG GCG CAT GGC GTC CGG GTT CTG GAA GAC GGC GTG AAC TAT GCA . .14 4 
Ala Leu Ala His Gly Val Arg Val Leu Glu Asp Glv Val Asn Tyr Ala 
35 40 45 

ACA GGG AAT TTG CCT GGT TGC TCT TTC TCT ATC TTC CTC TTG GCT TTA 192 
Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser lie Phe Leu Leu Ala Leu 
50 55 60 

CTG TCC TGT CTG ACC ATT CCA GCT TCC GCT TAT GAG GTG CGC AAC GTG 24 0 

Leu Ser Cys Leu Thr He Pro Ala Ser Ala Tyr Glu Val Arg Asn Val 
65 70 75 80 

TCC GGG ATG TAC CAT GTC ACG AAC GAC TGC TCC AAC TCA AGC ATT GTG 28 8 

Ser Gly Met Tyr His Val Thr Asn Asp Cys Ser Asn Ser Ser He Val 
85 90 95 






TAT GAG GCA GCG GAC ATG ATC ATG CAC ACC CCC GGG TGC GTG CCC TGC 336 
Tyr Glu Ala Ala Asp Mec lie Met His Thr Pro Gly Cys Val Pro Cys 
100 105 110 

GTT CGG GAG AAC AAC TCT TCC CGC TGC TGG GTA GCG CTC ACC CCC ACG 384 
Val Arg Glu Asn Asn Ser Ser Arg Cys Trp Val Ala Leu Thr Pro Thr 
115 120 125 

CTC GCA GCT AGG AAC GCC AGC GTC CCC ACT ACG ACA ATA CGA CGC CAC 4 32 

Leu Ala Ala Arg Asn Ala Ser Val Pro Thr Thr Thr He Arg Arg His 
130 " 135 140 

GTC GAT TTG CTC GTT GGG GCG GCT GCT TTC TGT TCC GCT ATG TAC GTG 480 
Val Asp Leu Leu Val Gly Ala Ala Ala Phe Cys Ser Ala Mec Tyr Val 
145 150 155 160 

GGG GAT CTC TGC GGA TCT GTC TTC CTC GTC TCC CAG CTG TTC ACC ATC 52 8 

Gly Asp Leu Cys Gly Ser Val Phe Leu Val Ser Gin Leu Phe Thr lie 
165 170 175 

TCG CCT CGC CGG CAT GAG ACG GTG CAG GAC TGC AAT TGC TCA ATC TAT 576 
Ser Pro Arg Arg Kis Glu Thr Val Gin Asp Cys Asn Cys Ser He Tyr 
180 185 190 

CCC GGC CAC ATA ACA GGT CAC CGT ATG GCT TGG GAT ATG ATG ATG AAC 624 
Pro Gly His He Thr Gly His Arg Met Ala Trp Asp Met Met Mec Asn 
195 200 205 

TGG TAATAG 633 
Trp 

210 

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 209 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Met Leu Gly Lys Val He Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu 
15 10 15 

Met Gly Tyr He Pro Leu Val Gly Ala Pro Leu Gly Gly Ala Ala Arg 
20 25 30 

Ala Leu Ala His Gly Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala 
35 ~ 40 45 

Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser lie Phe Leu Leu Ala Leu 
50 55 60 

Leu Ser Cys Leu Thr He Pro Ala Ser Ala Tyr Glu Val Arg Asn Val 

6 5 .70 75 . 3 0 

Ser Gly Met Tyr His Val Thr Asn Asp Cys Ser Asn Ser Ser He val 
8 5 90 . 9 5 

Tyr Glu Ala Ala Asp Met He Met His Thr Pro Gly Cys Val Pro Cys 
100 105 110 

Val Arg Glu Asn Asn ser Ser Arg Cys Trp Val Ala Leu Thr Pro Thr 
115 120 125 

Leu Ala Ala Arg Asn Ala Ser Val Pro Thr Thr Thr Ile Arg Arg His
    130                 135                 140

Val Asp Leu Leu Val Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val 
145 150 155 160 

Gly Asp Leu Cys Gly Ser Val Phe Leu Val Ser Gin Leu Phe Thr lie 
165 170 175 

Ser Pro Arg Arg His Glu Thr Val Gin Aso Cys Asn Cys Ser lie Tyr 
180 185 ' 190 

Pro Gly His lie Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn 
195 200 205 

Trp 



(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 483 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..480

(ix) FEATURE:
(A) NAME/KEY: mat_peptide
(B) LOCATION: 1..477

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

ATG CCC GGT TGC TCT TTC TCT ATC TTC CTC TTG GCC CTG CTG TCC TGT  48
Met Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys
1               5                   10                  15

CTG ACC ATA CCA GCT TCC GCT TAT GAA GTG CGC AAC GTG TCC GGG GTG  96
Leu Thr Ile Pro Ala Ser Ala Tyr Glu Val Arg Asn Val Ser Gly Val
            20                  25                  30

TAC CAT GTC ACG AAC GAC TGC TCC AAC TCA AGC ATA GTG TAT GAG GCA  144
Tyr His Val Thr Asn Asp Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala
        35                  40                  45

GCG GAC ATG ATC ATG CAC ACC CCC GGG TGC GTG CCC TGC GTT CGG GAG  192
Ala Asp Met Ile Met His Thr Pro Gly Cys Val Pro Cys Val Arg Glu
    50                  55                  60

GGC AAC TCC TCC CGT TGC TGG GTG GCG CTC ACT CCC ACG CTC GCG GCC  240
Gly Asn Ser Ser Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala
65                  70                  75                  80

AGG AAC GCC AGC GTC CCC ACA ACG ACA ATA CGA CGC CAC GTC GAT TTG  288
Arg Asn Ala Ser Val Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu
                85                  90                  95

CTC GTT GGG GCT GCT GCT TTC TGT TCC GCT ATG TAC GTG GGG GAT CTC  336
Leu Val Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu
            100                 105                 110

TGC GGA TCT GTT TTC CTT GTT TCC CAG CTG TTC ACC TTC TCA CCT CGC  384
Cys Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg
        115                 120                 125

CGG CAT CAA ACA GTA CAG GAC TGC AAC TGC TCA ATC TAT CCC GGC CAT  432
Arg His Gln Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly His
    130                 135                 140

GTA TCA GGT CAC CGC ATG GCT TGG GAT ATG ATG ATG AAC TGG TCC TAATAG  483
Val Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp Ser
145                 150                 155



(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 159 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

Met Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys
1               5                   10                  15

Leu Thr He Pro Ala Ser Ala- Tyr Glu Val Arg Asn Val Ser Gly Val 
20 25 30 

Tyr His Val Thr Asn Asp Cys Ser Asn Ser Ser He Val Tyr Glu Ala 
'35 4 0 45 

A^a Asp Met lie Met His Thr Pro Gly Cys Val Pro Cys val Arg Glu 
50 55 60 

Gly Asn Ser Ser Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala 
65 70 75 80 

Arg Asn Ala Ser Val Pro Thr Thr Thr He Arg Arg His Val Asp Lsu 
85 9° 95 

Leu Val Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu 
100 105 110 

Cvs Gly Ser Val Phe Leu Val Ser Gin Leu Phe Thr Phe Ser Pro Arg 
115 120 125 

Arg His Gin Thr Val Gin Asp Cys Asn Cys Ser He Tyr Pro Gly His 
130 135 140 

Val Ser Gly His Arg Met Ala Trp Asp Met Mec Mec Asn Trp Ser 
145 150 155 

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 480 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..477

(ix) FEATURE:
(A) NAME/KEY: mat_peptide
(B) LOCATION: 1..474

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

ATG TCC GGT TGC TCT TTC TCT ATC TTC CTC TTG GCC CTG CTG TCC TGT  48
Met Ser Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys
1               5                   10                  15



CTG ACC ATA CCA GCT TCC GCT TAT GAA GTG CGC AAC GTG TCC GGG GTG 96 
Leu Thr lie Pro Ala Ser Ala Tyr Glu Val Arg Asn Val Ser Gly Val 
20 25 30 

TAC CAT GTC ACG AAC GAC TGC TCC AAC TCA AGC ATA GTG TAT GAG GCA 14 4 

Tyr His Val Thr Asn Asp Cys Ser Asn Ser Ser lie Val Tyr Glu Ala 
35 40 45 

GCG GAC ATG ATC ATG CAC ACC CCC GGG TGC GTG CCC TGC GTT CGG GAG 192 
Ala Asp Met lie Met His Thr Pro Gly Cys Val Pro Cys Val Arg Glu 
50 55. 60 

GGC AAC TCC TCC CGT TGC TGG GTG GCG CTC ACT CCC ACG CTC GCG GCC 24 0 

Gly Asn Ser Ser Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala 
65 - 70 75 80 

AGG AAC GCC AGC. GTC CCC ACA ACG ACA ATA CGA CGC CAC GTC GAT TTG 28 8 

Arg Asn Ala Ser : Val Pro Thr Thr Thr lie Arg Arg His Val Asn Leu 
85 90 ~ ' 95 

CTC GTT GGG GCT GCT GCT TTC TGT TCC GCT ATG TAC GTG GGG GAT CTC 3 3 S 

Leu Val Gly Ala. Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asd Leu 
100 105 110 

TGC GGA TCT GTT TTC CTT GTT TCC CAG CTG TTC ACC TTC TCA CCT CGC 38 4 

Cys Gly Ser Val Phe Leu Val Ser Gin Leu Phe Thr Phe Ser Pro Arg 
115 120 125 

CGG CAT CAA ACA GTA CAG GAC TGC AAC TGC TCA ATC TAT CCC GGC CAT 43 2 

Arg His Gin Thr Val Gin Asp Cys Asn Cvs Ser lie Tyr Pro Gly His 
130 135 140 

GTA TCA GGT CAC CGC ATG GCT TGG GAT ATG ATG ATG AAC TGG TAATAG 48 0 

Val Ser Gly His Arg Met Ala Trp Asp Mec Met Met Asn Trp 
145 . 150 155 

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 158 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Met Ser Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys
1               5                   10                  15



Leu Thr He Pro Ala Ser Ala Tyr Glu Val Arg Asn Val Ser Gly Val 
20 25 30 

Tyr His Val Thr Asn Asp Cys Ser Asn Ser Ser lie Val Tyr Glu Ala 
35 40 45 

Ala Asp Men lie Met His Thr Pro Gly Cys Val Pro Cys Val Arg Glu 
50 55 ' 60 

Gly Asn Ser Ser Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala 
65 70 75 80 

Arg Asn Ala Ser Val Pro Thr Thr Thr He Arg Arg His Val Asp Leu 
85 90 " " 95 

Leu Val Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu 
100 105 " 110 

Cys Gly Ser Val Phe Leu Val Ser Gin Leu Phe Thr Phe Ser Pro Arg 
115 120 125 

Arg His Gin Thr Val Gin Asp Cys Asn Cys Ser He Tyr Pro Gly His 
130 135 140 

Val Ser Gly His Arg Met Ala Trp Asp Met Mac Met Asn Trp 
145 150 155 



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 636 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: CDNA 
(iii) HYPOTHETICAL: NO 
(iii) ANTI -SENSE: NO 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..633 

(ix) FEATURE: 

(A) NAME /KEY : mat_peptide 

(B) LOCATION: 1..630 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

ATG CTG GGT AAG GCC ATC GAT ACC CTT ACG TGC GGC TTC GCC GAC CTC  48
Met Leu Gly Lys Ala Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu
1               5                   10                  15

GTG GGG TAC ATT CCG CTC GTC GGC GCC CCC CTA GGG GGC GCT GCC AGG  96
Val Gly Tyr Ile Pro Leu Val Gly Ala Pro Leu Gly Gly Ala Ala Arg
            20                  25                  30

GCC CTG GCG CAT GGC GTC CGG GTT CTG GAA GAC GGC GTG AAC TAT GCA  144
Ala Leu Ala His Gly Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala
        35                  40                  45

ACA GGG AAT TTG CCT GGT TGC TCT TTC TCT ATC TTC CTC TTG GCT TTA  192
Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu
    50                  55                  60

CTG TCC TGT CTA ACC ATT CCA GCT TCC GCT TAC GAG GTG CGC AAC GTG  240
Leu Ser Cys Leu Thr Ile Pro Ala Ser Ala Tyr Glu Val Arg Asn Val
65                  70                  75                  80

TCC GGG ATG TAC CAT GTC ACG AAC GAC TGC TCC AAC TCA AGC ATT GTG 288 

Ser Gly Met Tyr His Val Thr Asn Asp Cys Ser Asn Ser Ser He val 
85 * 90 95 

TAT GAG GCA GCG GAC ATG ATC ATG CAC ACC CCC GGG TGC GTG CCC TGC 336 

Tyr Glu Ala Ala Asp Met He Met His Thr Pro Gly Cys Val Pro Cys 
100 105 no 

GTT CGG GAG AAC AAC TCT TCC CGC TGC TGG GTA GCG CTC ACC CCC ACG 384 

Val Arg Glu Asn Asn Ser Ser Arg Cys Trp Val Ala Leu Thr Pro Thr 

115 120 * 125 

CTC GCG GCT AGG AAC GCC AGC ATC CCC ACT ACA ACA ATA CGA CGC CAC 432 

Leu Ala Ala Arg Asn Ala Ser He Pro Thr Thr Thr He Arg Arg His 
130 135 140 

GTC GAT TTG CTC GTT GGG GCG GCT GCT TTC TGT TCC GCT ATG TAC GTG 48 0 

Val Asp Leu Leu Val Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val 
145 150 155 160 

GGG GAT CTC TGC GGA TCT GTC TTC CTC GTC TCC CAG CTG TTC ACC ATC 52 8 

Gly Asp Leu Cys Gly Ser Val Phe Leu Val Ser Gin Leu Phe Thr He 
1S5 170 175 

TCG CCT CGC . CGG CAT GAG ACG GTG CAG GAC TGC AAT TGC TCA ATC TAT 57 S 

Ser Pro Arg Arg His Glu Thr Val Gin Asp Cys Asn Cys Ser He Tyr 
180 185 190 

CCC GGC CAC ATA ACG GGT CAC CGT ATG GCT TGG GAT ATG ATG ATG AAC 62 4 

Pro Gly His He Thr Gly His Arg Met Ala Trp Asp Met Met 1 Met Asn 

195 200 2Q5 

TGG TAC TAATAG  636
Trp Tyr
    210



(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 210 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

Met Leu Gly Lys Ala Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu
1               5                   10                  15

Val Gly Tyr Ile Pro Leu Val Gly Ala Pro Leu Gly Gly Ala Ala Arg
            20                  25                  30

Ala Leu Ala His Gly Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala
        35                  40                  45

Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu
    50                  55                  60

Leu Ser Cys Leu Thr Ile Pro Ala Ser Ala Tyr Glu Val Arg Asn Val
65                  70                  75                  80

Ser Gly Met Tyr His Val Thr Asn Asp Cys Ser Asn Ser Ser Ile Val
                85                  90                  95

Tyr Glu Ala Ala Asp Met Ile Met His Thr Pro Gly Cys Val Pro Cys
            100                 105                 110

Val Arg Glu Asn Asn Ser Ser Arg Cys Trp Val Ala Leu Thr Pro Thr
        115                 120                 125

Leu Ala Ala Arg Asn Ala Ser Ile Pro Thr Thr Thr Ile Arg Arg His
    130                 135                 140

Val Asp Leu Leu Val Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val
145                 150                 155                 160

Gly Asp Leu Cys Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Ile
                165                 170                 175

Ser Pro Arg Arg His Glu Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr
            180                 185                 190

Pro Gly His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn
        195                 200                 205

Trp Tyr
    210



(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:
ATGCCCGGTT GCTCTTTCTC TATCTT  26

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
ATGTTGGGTA AGGTCATCGA TACCCT  26

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:
CTATTAGGAC CAGTTCATCA TCATATCCCA  30

(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
CTATTACCAG TTCATCATCA TATCCCA  27

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 36 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:
ATACGACGCC ACGTCGATTC CCAGCTGTTC ACCATC  36

(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 36 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:
GATGGTGAAC AGCTGGGAAT CGACGTGGCG TCGTAT  36
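
SEQ ID NO: 19 is listed as a sense oligonucleotide and SEQ ID NO: 20 as an anti-sense oligonucleotide of the same length, which suggests that the two are complementary. The following is a minimal Python sketch of how that relationship could be checked. The helper name `reverse_complement` is an illustrative assumption and is not part of the specification.

```python
# Map each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, ignoring the printed spacing."""
    seq = "".join(seq.split()).upper()
    return seq.translate(COMPLEMENT)[::-1]


# Sequences copied from the listing (groups of ten bases, as printed).
SEQ_19 = "ATACGACGCC ACGTCGATTC CCAGCTGTTC ACCATC"
SEQ_20 = "GATGGTGAAC AGCTGGGAAT CGACGTGGCG TCGTAT"

print(reverse_complement(SEQ_19) == "".join(SEQ_20.split()))
```

The same helper could be applied to the other anti-sense oligonucleotides in the listing to compare them with the coding sequences they are meant to anneal to.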
(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 723 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..720

(ix) FEATURE:
(A) NAME/KEY: mat_peptide
(B) LOCATION: 1..717

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

ATG TTG GGT AAG GTC ATC GAT ACC CTT ACA TGC GGC TTC GCC GAC CTC  48
Met Leu Gly Lys Val Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu
1               5                   10                  15

GTG GGG TAC ATT CCG CTC GTC GGC GCC CCC CTA GGG GGC GCT GCC AGG  96
Val Gly Tyr Ile Pro Leu Val Gly Ala Pro Leu Gly Gly Ala Ala Arg
            20                  25                  30

GCC CTG GCG CAT GGC GTC CGG GTT CTG GAG GAC GGC GTG AAC TAT GCA  144
Ala Leu Ala His Gly Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala
        35                  40                  45

ACA GGG AAT TTG CCC GGT TGC TCT TTC TCT ATC TTC CTC TTG GCT TTG  192
Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu
    50                  55                  60

CTG TCC TGT CTG ACC GTT CCA GCT TCC GCT TAT GAA GTG CGC AAC GTG  240
Leu Ser Cys Leu Thr Val Pro Ala Ser Ala Tyr Glu Val Arg Asn Val
65                  70                  75                  80

TCC GGG ATG TAC CAT GTC ACG AAC GAC TGC TCC AAC TCA AGC ATT GTG  288
Ser Gly Met Tyr His Val Thr Asn Asp Cys Ser Asn Ser Ser Ile Val
                85                  90                  95

TAT GAG GCA GCG GAC ATG ATC ATG CAC ACC CCC GGG TGC GTG CCC TGC  336
Tyr Glu Ala Ala Asp Met Ile Met His Thr Pro Gly Cys Val Pro Cys
            100                 105                 110

GTT CGG GAG AAC AAC TCT TCC CGC TGC TGG GTA GCG CTC ACC CCC ACG  384
Val Arg Glu Asn Asn Ser Ser Arg Cys Trp Val Ala Leu Thr Pro Thr
        115                 120                 125

CTC GCA GCT AGG AAC GCC AGC GTC CCC ACC ACG ACA ATA CGA CGC CAC  432
Leu Ala Ala Arg Asn Ala Ser Val Pro Thr Thr Thr Ile Arg Arg His
    130                 135                 140

GTC GAT TCC CAG CTG TTC ACC ATC TCG CCT CGC CGG CAT GAG ACG GTG  480
Val Asp Ser Gln Leu Phe Thr Ile Ser Pro Arg Arg His Glu Thr Val
145                 150                 155                 160

CAG GAC TGC AAT TGC TCA ATC TAT CCC GGC CAC ATA ACG GGT CAC CGT  528
Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly His Ile Thr Gly His Arg
                165                 170                 175

ATG GCT TGG GAT ATG ATG ATG AAC TGG TCG CCT ACA ACG GCC CTG GTG  576
Met Ala Trp Asp Met Met Met Asn Trp Ser Pro Thr Thr Ala Leu Val
            180                 185                 190

GTA TCG CAG CTG CTC CGG ATC CCA CAA GCT GTC GTG GAC ATG GTG GCG  624
Val Ser Gln Leu Leu Arg Ile Pro Gln Ala Val Val Asp Met Val Ala
        195                 200                 205

GGG GCC CAT TGG GGA GTC CTG GCG GGT CTC GCC TAC TAT TCC ATG GTG  672
Gly Ala His Trp Gly Val Leu Ala Gly Leu Ala Tyr Tyr Ser Met Val
    210                 215                 220

GGG AAC TGG GCT AAG GTT TTG ATT GTG ATG CTA CTC TTT GCT CCC TAATAG  723
Gly Asn Trp Ala Lys Val Leu Ile Val Met Leu Leu Phe Ala Pro
225                 230                 235



(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 239 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

Met Leu Gly Lys Val lie Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu 
1 5 10 15 

Val Gly Tyr He Pro Leu Val Gly Ala Pro Leu Gly Gly Ala Ala Arg 
20 25 30 

Ala Leu Ala His Gly Val Arg Val Leu Glu Asp Glv Val Asn Tyr Ala 
35 4 0 "45 

Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser He Phe Leu Leu Ala Leu 
SO 55 60 

Leu Ser Cys Leu Thr Val Pro Ala Ser Ala Tyr Glu Val Arg Asn Val 
6 5 7 0 75 8 0 

Ser Gly Met Tyr His Val Thr Asn Asp Cys Ser Asn Ser Ser Ile Val
85 90 95 

Tyr Glu Ala Ala Asp Met He Met His Thr Pro Gly Cvs Val Pro Cys 
100 105 'llO 

Val Arg Glu Asn Asn Ser Ser Arg Cys Trp Val Ala Leu Thr Pro Thr 
115 120 125 

Leu Ala Ala Arg Asn Ala Ser Val Pro Thr Thr Thr He Arg Arg His 
130 135 140 ■ 

Val Asp Ser Gin Leu Phe Thr He Ser Pro Arg Arg His Glu Thr Val 
145 iso 155 ISO 

Gin Asp Cys Asn Cys Ser He Tyr Pro Gly His He Thr Gly His Arg 
165 170 ' 175 
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Met Ala Trp Asp Met Met Met Asn Trp Ser Pro Thr Thr Ala Leu Val 
180 185 190 

Val Ser Gin Leu Leu Arg lie Pro Gin Ala Val Val Asp Mec val Ala 
195 200 205 

Gly Ala His Trp Gly Val Leu Ala Gly Leu Ala Tyr Tyr Ser Mec Val 
210 215 220 



Gly Asn Trp Ala Lys Val Leu lie Val Mec Leu Leu Phe Ala Pro 
225 230 235 



(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 561 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..558

(ix) FEATURE:
(A) NAME/KEY: mat_peptide
(B) LOCATION: 1..555

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

ATG TTG GGT AAG GTC ATC GAT ACC CTT ACA TGC GGC TTC GCC GAC CTC 48
Met Leu Gly Lys Val Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu
1 5 10 15

GTG GGG TAC ATT CCG CTC GTC GGC GCC CCC CTA GGG GGC GCT GCC AGG 96
Val Gly Tyr Ile Pro Leu Val Gly Ala Pro Leu Gly Gly Ala Ala Arg
20 25 30

GCC CTG GCG CAT GGC GTC CGG GTT CTG GAG GAC GGC GTG AAC TAT GCA 144
Ala Leu Ala His Gly Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala
35 40 45

ACA GGG AAT TTG CCC GGT TGC TCT TTC TCT ATC TTC CTC TTG GCT TTG 192
Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu
50 55 60

CTG TCC TGT CTG ACC GTT CCA GCT TCC GCT TAT GAA GTG CGC AAC GTG 240
Leu Ser Cys Leu Thr Val Pro Ala Ser Ala Tyr Glu Val Arg Asn Val
65 70 75 80

TCC GGG ATG TAC CAT GTC ACG AAC GAC TGC TCC AAC TCA AGC ATT GTG 288
Ser Gly Met Tyr His Val Thr Asn Asp Cys Ser Asn Ser Ser Ile Val
85 90 95

TAT GAG GCA GCG GAC ATG ATC ATG CAC ACC CCC GGG TGC GTG CCC TGC 336
Tyr Glu Ala Ala Asp Met Ile Met His Thr Pro Gly Cys Val Pro Cys
100 105 110

GTT CGG GAG AAC AAC TCT TCC CGC TGC TGG GTA GCG CTC ACC CCC ACG 384
Val Arg Glu Asn Asn Ser Ser Arg Cys Trp Val Ala Leu Thr Pro Thr
115 120 125

CTC GCA GCT AGG AAC GCC AGC GTC CCC ACC ACG ACA ATA CGA CGC CAC 432
Leu Ala Ala Arg Asn Ala Ser Val Pro Thr Thr Thr Ile Arg Arg His
130 135 140

GTC GAT TCC CAG CTG TTC ACC ATC TCG CCT CGC CGG CAT GAG ACG GTG 480
Val Asp Ser Gln Leu Phe Thr Ile Ser Pro Arg Arg His Glu Thr Val
145 150 155 160

CAG GAC TGC AAT TGC TCA ATC TAT CCC GGC CAC ATA ACG GGT CAC CGT 528
Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly His Ile Thr Gly His Arg
165 170 175

ATG GCT TGG GAT ATG ATG ATG AAC TGG TAATAG 561
Met Ala Trp Asp Met Met Met Asn Trp
180 185

(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 185 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

Met Leu Gly Lys Val Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu
1 5 10 15

Val Gly Tyr Ile Pro Leu Val Gly Ala Pro Leu Gly Gly Ala Ala Arg
20 25 30

Ala Leu Ala His Gly Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala
35 40 45

Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu
50 55 60

Leu Ser Cys Leu Thr Val Pro Ala Ser Ala Tyr Glu Val Arg Asn Val
65 70 75 80

Ser Gly Met Tyr His Val Thr Asn Asp Cys Ser Asn Ser Ser Ile Val
85 90 95

Tyr Glu Ala Ala Asp Met Ile Met His Thr Pro Gly Cys Val Pro Cys
100 105 110

Val Arg Glu Asn Asn Ser Ser Arg Cys Trp Val Ala Leu Thr Pro Thr
115 120 125

Leu Ala Ala Arg Asn Ala Ser Val Pro Thr Thr Thr Ile Arg Arg His
130 135 140

Val Asp Ser Gln Leu Phe Thr Ile Ser Pro Arg Arg His Glu Thr Val
145 150 155 160

Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly His Ile Thr Gly His Arg
165 170 175

Met Ala Trp Asp Met Met Met Asn Trp
180 185

(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 605 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO



(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..603

(ix) FEATURE:

(A) NAME/KEY: mat_peptide

(B) LOCATION: 1..600



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

ATG TTG GGT AAG GTC ATC GAT ACC CTT ACA TGC GGC TTC GCC GAC CTC 48
Met Leu Gly Lys Val Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu
1 5 10 15

GTG GGG TAC ATT CCG CTC GTC GGC GCC CCC CTA GGG GGC GCT GCC AGG 96
Val Gly Tyr Ile Pro Leu Val Gly Ala Pro Leu Gly Gly Ala Ala Arg
20 25 30

GCC CTG GCG CAT GGC GTC CGG GTT CTG GAG GAC GGC GTG AAC TAT GCA 144
Ala Leu Ala His Gly Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala
35 40 45

ACA GGG AAT TTG CCC GGT TGC TCT TTC TCT ATC TTC CTC TTG GCT TTG 192
Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu
50 55 60

CTG TCC TGT CTG ACC GTT CCA GCT TCC GCT TAT GAA GTG CGC AAC GTG 240
Leu Ser Cys Leu Thr Val Pro Ala Ser Ala Tyr Glu Val Arg Asn Val
65 70 75 80

TCC GGG ATG TAC CAT GTC ACG AAC GAC TGC TCC AAC TCA AGC ATT GTG 288
Ser Gly Met Tyr His Val Thr Asn Asp Cys Ser Asn Ser Ser Ile Val
85 90 95

TAT GAG GCA GCG GAC ATG ATC ATG CAC ACC CCC GGG TGC GTG CCC TGC 336
Tyr Glu Ala Ala Asp Met Ile Met His Thr Pro Gly Cys Val Pro Cys
100 105 110

GTT CGG GAG AAC AAC TCT TCC CGC TGC TGG GTA GCG CTC ACC CCC ACG 384
Val Arg Glu Asn Asn Ser Ser Arg Cys Trp Val Ala Leu Thr Pro Thr
115 120 125

CTC GCA GCT AGG AAC GCC AGC GTC CCC ACC ACG ACA ATA CGA CGC CAC 432
Leu Ala Ala Arg Asn Ala Ser Val Pro Thr Thr Thr Ile Arg Arg His
130 135 140

GTC GAT TCC CAG CTG TTC ACC ATC TCG CCT CGC CGG CAT GAG ACG GTG 480
Val Asp Ser Gln Leu Phe Thr Ile Ser Pro Arg Arg His Glu Thr Val
145 150 155 160

CAG GAC TGC AAT TGC TCA ATC TAT CCC GGC CAC ATA ACG GGT CAC CGT 528
Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly His Ile Thr Gly His Arg
165 170 175

ATG GCT TGG GAT ATG ATG ATG AAC TGG TCG CCT ACA ACG GCC CTG GTG 576
Met Ala Trp Asp Met Met Met Asn Trp Ser Pro Thr Thr Ala Leu Val
180 185 190

GTA TCG CAG CTG CTC CGG ATC CTC TAATAG
Val Ser Gln Leu Leu Arg Ile Leu
195 200



(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 200 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

Met Leu Gly Lys Val Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu
1 5 10 15

Val Gly Tyr Ile Pro Leu Val Gly Ala Pro Leu Gly Gly Ala Ala Arg
20 25 30

Ala Leu Ala His Gly Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala
35 40 45

Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu
50 55 60

Leu Ser Cys Leu Thr Val Pro Ala Ser Ala Tyr Glu Val Arg Asn Val
65 70 75 80

Ser Gly Met Tyr His Val Thr Asn Asp Cys Ser Asn Ser Ser Ile Val
85 90 95

Tyr Glu Ala Ala Asp Met Ile Met His Thr Pro Gly Cys Val Pro Cys
100 105 110

Val Arg Glu Asn Asn Ser Ser Arg Cys Trp Val Ala Leu Thr Pro Thr
115 120 125

Leu Ala Ala Arg Asn Ala Ser Val Pro Thr Thr Thr Ile Arg Arg His
130 135 140

Val Asp Ser Gln Leu Phe Thr Ile Ser Pro Arg Arg His Glu Thr Val
145 150 155 160

Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly His Ile Thr Gly His Arg
165 170 175

Met Ala Trp Asp Met Met Met Asn Trp Ser Pro Thr Thr Ala Leu Val
180 185 190

Val Ser Gln Leu Leu Arg Ile Leu
195 200



(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 636 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..633

(ix) FEATURE:

(A) NAME/KEY: mat_peptide

(B) LOCATION: 1..630

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

ATG TTG GGT AAG GTC ATC GAT ACC CTT ACA TGC GGC TTC GCC GAC CTC 48
Met Leu Gly Lys Val Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu
1 5 10 15

GTG GGG TAC ATT CCG CTC GTC GGC GCC CCC CTA GGG GGC GCT GCC AGG 96
Val Gly Tyr Ile Pro Leu Val Gly Ala Pro Leu Gly Gly Ala Ala Arg
20 25 30

GCC CTG GCG CAT GGC GTC CGG GTT CTG GAG GAC GGC GTG AAC TAT GCA 144
Ala Leu Ala His Gly Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala
35 40 45

ACA GGG AAT TTG CCC GGT TGC TCT TTC TCT ATC TTC CTC TTG GCT TTG 192
Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu
50 55 60

CTG TCC TGT CTG ACC GTT CCA GCT TCC GCT TAT GAA GTG CGC AAC GTG 240
Leu Ser Cys Leu Thr Val Pro Ala Ser Ala Tyr Glu Val Arg Asn Val
65 70 75 80

TCC GGG ATG TAC CAT GTC ACG AAC GAC TGC TCC AAC TCA AGC ATT GTG 288
Ser Gly Met Tyr His Val Thr Asn Asp Cys Ser Asn Ser Ser Ile Val
85 90 95

TAT GAG GCA GCG GAC ATG ATC ATG CAC ACC CCC GGG TGC GTG CCC TGC 336
Tyr Glu Ala Ala Asp Met Ile Met His Thr Pro Gly Cys Val Pro Cys
100 105 110

GTT CGG GAG AAC AAC TCT TCC CGC TGC TGG GTA GCG CTC ACC CCC ACG 384
Val Arg Glu Asn Asn Ser Ser Arg Cys Trp Val Ala Leu Thr Pro Thr
115 120 125

CTC GCA GCT AGG AAC GCC AGC GTC CCC ACC ACG ACA ATA CGA CGC CAC 432
Leu Ala Ala Arg Asn Ala Ser Val Pro Thr Thr Thr Ile Arg Arg His
130 135 140

GTC GAT TCC CAG CTG TTC ACC ATC TCG CCT CGC CGG CAT GAG ACG GTG 480
Val Asp Ser Gln Leu Phe Thr Ile Ser Pro Arg Arg His Glu Thr Val
145 150 155 160

CAG GAC TGC AAT TGC TCA ATC TAT CCC GGC CAC ATA ACG GGT CAC CGT 528
Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly His Ile Thr Gly His Arg
165 170 175

ATG GCT TGG GAT ATG ATG ATG AAC TGG TCG CCT ACA ACG GCC CTG GTG 576
Met Ala Trp Asp Met Met Met Asn Trp Ser Pro Thr Thr Ala Leu Val
180 185 190

GTA TCG CAG CTG CTC CGG ATC GTG ATC GAG GGC AGA CAC CAT CAC CAC 624
Val Ser Gln Leu Leu Arg Ile Val Ile Glu Gly Arg His His His His
195 200 205

CAT CAC TAATAG 636
His His
210

(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 210 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

Met Leu Gly Lys Val Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu
1 5 10 15

Val Gly Tyr Ile Pro Leu Val Gly Ala Pro Leu Gly Gly Ala Ala Arg
20 25 30

Ala Leu Ala His Gly Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala
35 40 45

Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu
50 55 60

Leu Ser Cys Leu Thr Val Pro Ala Ser Ala Tyr Glu Val Arg Asn Val
65 70 75 80

Ser Gly Met Tyr His Val Thr Asn Asp Cys Ser Asn Ser Ser Ile Val
85 90 95

Tyr Glu Ala Ala Asp Met Ile Met His Thr Pro Gly Cys Val Pro Cys
100 105 110

Val Arg Glu Asn Asn Ser Ser Arg Cys Trp Val Ala Leu Thr Pro Thr
115 120 125

Leu Ala Ala Arg Asn Ala Ser Val Pro Thr Thr Thr Ile Arg Arg His
130 135 140

Val Asp Ser Gln Leu Phe Thr Ile Ser Pro Arg Arg His Glu Thr Val
145 150 155 160

Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly His Ile Thr Gly His Arg
165 170 175

Met Ala Trp Asp Met Met Met Asn Trp Ser Pro Thr Thr Ala Leu Val
180 185 190

Val Ser Gln Leu Leu Arg Ile Val Ile Glu Gly Arg His His His His
195 200 205

His His
210

(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 630 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..627

(ix) FEATURE:

(A) NAME/KEY: mat_peptide

(B) LOCATION: 1..624



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 

ATG GGT AAG GTC ATC GAT ACC CTT ACG TGC GGA TTC GCC GAT CTC ATG 48
Met Gly Lys Val Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu Met
1 5 10 15

GGG TAC ATC CCG CTC GTC GGC GCT CCC GTA GGA GGC GTC GCA AGA GCC 96
Gly Tyr Ile Pro Leu Val Gly Ala Pro Val Gly Gly Val Ala Arg Ala
20 25 30

CTT GCG CAT GGC GTG AGG GCC CTT GAA GAC GGG ATA AAT TTC GCA ACA 144
Leu Ala His Gly Val Arg Ala Leu Glu Asp Gly Ile Asn Phe Ala Thr
35 40 45

GGG AAT TTG CCC GGT TGC TCC TTT TCT ATT TTC CTT CTC GCT CTG TTC 192
Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Phe
50 55 60

TCT TGC TTA ATT CAT CCA GCA GCT AGT CTA GAG TGG CGG AAT ACG TCT 240
Ser Cys Leu Ile His Pro Ala Ala Ser Leu Glu Trp Arg Asn Thr Ser
65 70 75 80

GGC CTC TAT GTC CTT ACC AAC GAC TGT TCC AAT AGC AGT ATT GTG TAC 288
Gly Leu Tyr Val Leu Thr Asn Asp Cys Ser Asn Ser Ser Ile Val Tyr
85 90 95

GAG GCC GAT GAC GTT ATT CTG CAC ACA CCC GGC TGC ATA CCT TGT GTC 336
Glu Ala Asp Asp Val Ile Leu His Thr Pro Gly Cys Ile Pro Cys Val
100 105 110

CAG GAC GGC AAT ACA TCC ACG TGC TGG ACC CCA GTG ACA CCT ACA GTG 384
Gln Asp Gly Asn Thr Ser Thr Cys Trp Thr Pro Val Thr Pro Thr Val
115 120 125

GCA GTC AAG TAC GTC GGA GCA ACC ACC GCT TCG ATA CGC AGT CAT GTG 432
Ala Val Lys Tyr Val Gly Ala Thr Thr Ala Ser Ile Arg Ser His Val
130 135 140

GAC CTA TTA GTG GGC GCG GCC ACG ATG TGC TCT GCG CTC TAC GTG GGT 480
Asp Leu Leu Val Gly Ala Ala Thr Met Cys Ser Ala Leu Tyr Val Gly
145 150 155 160

GAC ATG TGT GGG GCT GTC TTC CTC GTG GGA CAA GCC TTC ACG TTC AGA 528
Asp Met Cys Gly Ala Val Phe Leu Val Gly Gln Ala Phe Thr Phe Arg
165 170 175

CCT CGT CGC CAT CAA ACG GTC CAG ACC TGT AAC TGC TCG CTG TAC CCA 576
Pro Arg Arg His Gln Thr Val Gln Thr Cys Asn Cys Ser Leu Tyr Pro
180 185 190

GGC CAT CTT TCA GGA CAT CGA ATG GCT TGG GAT ATG ATG ATG AAC TGG 624
Gly His Leu Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
195 200 205

TAATAG 630

(2) INFORMATION FOR SEQ ID NO: 30:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 208 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

Met Gly Lys Val Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu Met
1 5 10 15

Gly Tyr Ile Pro Leu Val Gly Ala Pro Val Gly Gly Val Ala Arg Ala
20 25 30

Leu Ala His Gly Val Arg Ala Leu Glu Asp Gly Ile Asn Phe Ala Thr
35 40 45

Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Phe
50 55 60

Ser Cys Leu Ile His Pro Ala Ala Ser Leu Glu Trp Arg Asn Thr Ser
65 70 75 80

Gly Leu Tyr Val Leu Thr Asn Asp Cys Ser Asn Ser Ser Ile Val Tyr
85 90 95

Glu Ala Asp Asp Val Ile Leu His Thr Pro Gly Cys Ile Pro Cys Val
100 105 110

Gln Asp Gly Asn Thr Ser Thr Cys Trp Thr Pro Val Thr Pro Thr Val
115 120 125

Ala Val Lys Tyr Val Gly Ala Thr Thr Ala Ser Ile Arg Ser His Val
130 135 140

Asp Leu Leu Val Gly Ala Ala Thr Met Cys Ser Ala Leu Tyr Val Gly
145 150 155 160

Asp Met Cys Gly Ala Val Phe Leu Val Gly Gln Ala Phe Thr Phe Arg
165 170 175

Pro Arg Arg His Gln Thr Val Gln Thr Cys Asn Cys Ser Leu Tyr Pro
180 185 190

Gly His Leu Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
195 200 205



(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 630 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..627

(ix) FEATURE:

(A) NAME/KEY: mat_peptide 

(B) LOCATION: 1..624 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 

ATG GGT AAG GTC ATC GAT ACC CTA ACG TGC GGA TTC GCC GAT CTC ATG 48
Met Gly Lys Val Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu Met
1 5 10 15

GGG TAT ATC CCG CTC GTA GGC GGC CCC ATT GGG GGC GTC GCA AGG GCT 96
Gly Tyr Ile Pro Leu Val Gly Gly Pro Ile Gly Gly Val Ala Arg Ala
20 25 30

CTC GCA CAC GGT GTG AGG GTC CTT GAG GAC GGG GTA AAC TAT GCA ACA 144
Leu Ala His Gly Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr
35 40 45

GGG AAT TTA CCC GGT TGC TCT TTC TCT ATC TTT ATT CTT GCT CTT CTC 192
Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile Phe Ile Leu Ala Leu Leu
50 55 60

TCG TGT CTG ACC GTT CCG GCC TCT GCA GTT CCC TAC CGA AAT GCC TCT 240
Ser Cys Leu Thr Val Pro Ala Ser Ala Val Pro Tyr Arg Asn Ala Ser
65 70 75 80

GGG ATT TAT CAT GTT ACC AAT GAT TGC CCA AAC TCT TCC ATA GTC TAT 288
Gly Ile Tyr His Val Thr Asn Asp Cys Pro Asn Ser Ser Ile Val Tyr
85 90 95

GAG GCA GAT AAC CTG ATC CTA CAC GCA CCT GGT TGC GTG CCT TGT GTC 336
Glu Ala Asp Asn Leu Ile Leu His Ala Pro Gly Cys Val Pro Cys Val
100 105 110

ATG ACA GGT AAT GTG AGT AGA TGC TGG GTC CAA ATT ACC CCT ACA CTG 384
Met Thr Gly Asn Val Ser Arg Cys Trp Val Gln Ile Thr Pro Thr Leu
115 120 125

TCA GCC CCG AGC CTC GGA GCA GTC ACG GCT CCT CTT CGG AGA GCC GTT 432
Ser Ala Pro Ser Leu Gly Ala Val Thr Ala Pro Leu Arg Arg Ala Val
130 135 140

GAC TAC CTA GCG GGA GGG GCT GCC CTC TGC TCC GCG TTA TAC GTA GGA 480
Asp Tyr Leu Ala Gly Gly Ala Ala Leu Cys Ser Ala Leu Tyr Val Gly
145 150 155 160

GAC GCG TGT GGG GCA CTA TTC TTG GTA GGC CAA ATG TTC ACC TAT AGG 528
Asp Ala Cys Gly Ala Leu Phe Leu Val Gly Gln Met Phe Thr Tyr Arg
165 170 175

CCT CGC CAG CAC GCT ACG GTG CAG AAC TGC AAC TGT TCC ATT TAC AGT 576
Pro Arg Gln His Ala Thr Val Gln Asn Cys Asn Cys Ser Ile Tyr Ser
180 185 190

GGC CAT GTT ACC GGC CAC CGG ATG GCA TGG GAT ATG ATG ATG AAC TGG 624
Gly His Val Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
195 200 205

TAATAG 630



(2) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 208 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:

Met Gly Lys Val Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu Met
1 5 10 15

Gly Tyr Ile Pro Leu Val Gly Gly Pro Ile Gly Gly Val Ala Arg Ala
20 25 30

Leu Ala His Gly Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr
35 40 45

Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile Phe Ile Leu Ala Leu Leu
50 55 60

Ser Cys Leu Thr Val Pro Ala Ser Ala Val Pro Tyr Arg Asn Ala Ser
65 70 75 80

Gly Ile Tyr His Val Thr Asn Asp Cys Pro Asn Ser Ser Ile Val Tyr
85 90 95

Glu Ala Asp Asn Leu Ile Leu His Ala Pro Gly Cys Val Pro Cys Val
100 105 110

Met Thr Gly Asn Val Ser Arg Cys Trp Val Gln Ile Thr Pro Thr Leu
115 120 125

Ser Ala Pro Ser Leu Gly Ala Val Thr Ala Pro Leu Arg Arg Ala Val
130 135 140

Asp Tyr Leu Ala Gly Gly Ala Ala Leu Cys Ser Ala Leu Tyr Val Gly
145 150 155 160

Asp Ala Cys Gly Ala Leu Phe Leu Val Gly Gln Met Phe Thr Tyr Arg
165 170 175

Pro Arg Gln His Ala Thr Val Gln Asn Cys Asn Cys Ser Ile Tyr Ser
180 185 190

Gly His Val Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
195 200 205



(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 23 base pairs 
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

TGGGATATGA TGATGAACTG GTC 23

(2) INFORMATION FOR SEQ ID NO: 34:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 
CTATTATGGT GGTAAGCCAC AGAGCAGGAG 

30 

(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1476 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1473

(ix) FEATURE:

(A) NAME/KEY: mat_peptide

(B) LOCATION: 1..1470 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 

TGG GAT ATG ATG ATG AAC TGG TCG CCT ACA ACG GCC CTG GTG GTA TCG 48
Trp Asp Met Met Met Asn Trp Ser Pro Thr Thr Ala Leu Val Val Ser
1 5 10 15

Gl n fjfr ? TC CGG A ?° CCA CAA GCT GTC CT G GAC ATG GTG GCG GGG GCC 96
Gln Leu Leu Arg Ile Pro Gln Ala Val Val Asp Met Val Ala Gly Ala
20 25 30

nil SI? TP °? G ^ C 0X0 GCC ™C TAT TCC ATG GTG GGG AAC 144
His Trp Gly Val Leu Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn
35 40 45

Tro A°I W G SIT T* ST JT? ATG CTA ^ ^ GCC GGC GTC GAC GGG 192
Trp Ala Lys Val Leu Val Val Met Leu Leu Phe Ala Gly Val Asp Gly
50 55 60

Hil S 7 ? 1°" GGA GGG GCA GCA GCC TCC GAT ACC AGG GGC CTT 240
His Thr Arg Val Ser Gly Gly Ala Ala Ala Ser Asp Thr Arg Gly Leu
65 70 75 80

GTG TCC CTC TTT AGC CCC GGG TCG GCT CAG AAA ATC CAG CTC GTA AAC 288
Val Ser Leu Phe Ser Pro Gly Ser Ala Gln Lys Ile Gln Leu Val Asn
85 90 95

Th^ A^n r?5 C GT S AC rT C AGG ACT GCC AAC TGC AAC GAC , , , 336
Thr Asn Gly Ser Trp His Ile Asn Arg Thr Ala Leu Asn Cys Asn Asp
100 105 110

TCC CTC CAA ACA GGG TTC TTT GCC GCA CTA TTC TAC AAA CAC AAA TTC 384
Ser Leu Gln Thr Gly Phe Phe Ala Ala Leu Phe Tyr Lys His Lys Phe
115 120 125

AAC TCG TCT GGA TGC CCA GAG CGC TTG GCC AGC TGT CGC TCC ATC GAC 432
Asn Ser Ser Gly Cys Pro Glu Arg Leu Ala Ser Cys Arg Ser Ile Asp
130 135 140

AAG TTC GCT CAG GGG TGG GGT CCC CTC ACT TAC ACT GAG CCT AAC AGC 480
Lys Phe Ala Gln Gly Trp Gly Pro Leu Thr Tyr Thr Glu Pro Asn Ser
145 150 155 160

TCG GAC CAG AGG CCC TAC TGC TGG CAC TAC GCG CCT CGA CCG TGT GGT 528
Ser Asp Gln Arg Pro Tyr Cys Trp His Tyr Ala Pro Arg Pro Cys Gly
165 170 175

ATT GTA CCC GCG TCT CAG GTG TGC GGT CCA GTG TAT TGC TTC ACC CCG 576
Ile Val Pro Ala Ser Gln Val Cys Gly Pro Val Tyr Cys Phe Thr Pro
180 185 190

AGC CCT GTT GTG GTG GGG ACG ACC GAT CGG TTT GGT GTC CCC ACG TAT 624
Ser Pro Val Val Val Gly Thr Thr Asp Arg Phe Gly Val Pro Thr Tyr
195 200 205

AAC TGG GGG GCG AAC GAC TCG GAT GTG CTG ATT CTC AAC AAC ACG CGG 672
Asn Trp Gly Ala Asn Asp Ser Asp Val Leu Ile Leu Asn Asn Thr Arg
210 215 220

CCG CCG CGA GGC AAC TGG TTC GGC TGT ACA TGG ATG AAT GGC ACT GGG 720
Pro Pro Arg Gly Asn Trp Phe Gly Cys Thr Trp Met Asn Gly Thr Gly
225 230 235 240

TTC ACC AAG ACG TGT GGC GGC CCC CCG TGC AAC ATC GGG GGG GCC GGC 768
Phe Thr Lys Thr Cys Gly Gly Pro Pro Cys Asn Ile Gly Gly Ala Gly
245 250 255

AAC AAC ACC TTG ACC TGC CCC ACT GAC TGT TTT CGG AAG CAC CCC GAG 816
Asn Asn Thr Leu Thr Cys Pro Thr Asp Cys Phe Arg Lys His Pro Glu
260 265 270

GCC ACC TAC GCC AGA TGC GGT TCT GGG CCC TGG CTG ACA CCT AGG TGT 864
Ala Thr Tyr Ala Arg Cys Gly Ser Gly Pro Trp Leu Thr Pro Arg Cys
275 280 285

ATG GTT CAT TAC CCA TAT AGG CTC TGG CAC TAC CCC TGC ACT GTC AAC 912
Met Val His Tyr Pro Tyr Arg Leu Trp His Tyr Pro Cys Thr Val Asn
290 295 300

TTC ACC ATC TTC AAG GTT AGG ATG TAC GTG GGG GGC GTG GAG CAC AGG 960
Phe Thr Ile Phe Lys Val Arg Met Tyr Val Gly Gly Val Glu His Arg
305 310 315 320

TTC GAA GCC GCA TGC AAT TGG ACT CGA GGA GAG CGT TGT GAC TTG GAG 1008
Phe Glu Ala Ala Cys Asn Trp Thr Arg Gly Glu Arg Cys Asp Leu Glu
325 330 335

GAC AGG GAT AGA TCA GAG CTT AGC CCG CTG CTG CTG TCT ACA ACA GAG 1056
Asp Arg Asp Arg Ser Glu Leu Ser Pro Leu Leu Leu Ser Thr Thr Glu
340 345 350

TGG CAG ATA CTG CCC TGT TCC TTC ACC ACC CTG CCG GCC CTA TCC ACC 1104
Trp Gln Ile Leu Pro Cys Ser Phe Thr Thr Leu Pro Ala Leu Ser Thr



355 360 365

G?v 7£, rTf ^ C 7 TC £ AT A T C GTG GAC GT G CAA TAC CTG TAC 1152
Gly Leu Ile His Leu His Gln Asn Ile Val Asp Val Gln Tyr Leu Tyr
370 375 380

PW vTt o CG GCG fTT GTC TCC m ^ ATC ^ TCG GAG TAT GTC 1200
Gly Val Gly Ser Ala Val Val Ser Leu Val Ile Lys Trp Glu Tyr Val
385 390 395 400

CTG TTG CTC TTC CTT CTC CTG GCA GAC GCG CGC ATC TGC GCC TGC TTA 1248
Leu Leu Leu Phe Leu Leu Leu Ala Asp Ala Arg Ile Cys Ala Cys Leu
405 410 415

TGG ATG ATG CTG CTG ATA GCT CAA GCT GAG GCC GCC TTA GAG AAC CTG 1296
Trp Met Met Leu Leu Ile Ala Gln Ala Glu Ala Ala Leu Glu Asn Leu
420 425 430

GTG GTC CTC AAT GCG GCG GCC GTG GCC GGG GCG CAT GGC ACT CTT TCC 1344
Val Val Leu Asn Ala Ala Ala Val Ala Gly Ala His Gly Thr Leu Ser
435 440 445

TTC CTT GTG TTC TTC TGT GCT GCC TGG TAC ATC AAG GGC AGG CTG GTC 1392
Phe Leu Val Phe Phe Cys Ala Ala Trp Tyr Ile Lys Gly Arg Leu Val
450 455 460

CCT GGT GCG GCA TAC GCC TTC TAT GGC GTG TGG CCG CTG CTC CTG CTT 1440
Pro Gly Ala Ala Tyr Ala Phe Tyr Gly Val Trp Pro Leu Leu Leu Leu
465 470 475 480

CTG CTG GCC TTA CCA CCA CGA GCT TAT GCC TAGTAA 1476
Leu Leu Ala Leu Pro Pro Arg Ala Tyr Ala
485 490

(2) INFORMATION FOR SEQ ID NO: 36:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 490 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:

Trp Asp Met Met Met Asn Trp Ser Pro Thr Thr Ala Leu Val Val Ser
1 5 10 15

Gln Leu Leu Arg Ile Pro Gln Ala Val Val Asp Met Val Ala Gly Ala
20 25 30

His Trp Gly Val Leu Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn
35 40 45

Trp Ala Lys Val Leu Val Val Met Leu Leu Phe Ala Gly Val Asp Gly
50 55 60

His Thr Arg Val Ser Gly Gly Ala Ala Ala Ser Asp Thr Arg Gly Leu
65 70 75 80

Val Ser Leu Phe Ser Pro Gly Ser Ala Gln Lys Ile Gln Leu Val Asn
85 90 95

Thr Asn Gly Ser Trp His Ile Asn Arg Thr Ala Leu Asn Cys Asn Asp
100 105 110

Ser Leu Gln Thr Gly Phe Phe Ala Ala Leu Phe Tyr Lys His Lys Phe
115 120 125

Asn Ser Ser Gly Cys Pro Glu Arg Leu Ala Ser Cys Arg Ser Ile Asp
130 135 140

Lys Phe Ala Gln Gly Trp Gly Pro Leu Thr Tyr Thr Glu Pro Asn Ser
145 150 155 160

Ser Asp Gln Arg Pro Tyr Cys Trp His Tyr Ala Pro Arg Pro Cys Gly
165 170 175

Ile Val Pro Ala Ser Gln Val Cys Gly Pro Val Tyr Cys Phe Thr Pro
180 185 190

Ser Pro Val Val Val Gly Thr Thr Asp Arg Phe Gly Val Pro Thr Tyr
195 200 205

Asn Trp Gly Ala Asn Asp Ser Asp Val Leu Ile Leu Asn Asn Thr Arg
210 215 220

Pro Pro Arg Gly Asn Trp Phe Gly Cys Thr Trp Met Asn Gly Thr Gly
225 230 235 240

Phe Thr Lys Thr Cys Gly Gly Pro Pro Cys Asn Ile Gly Gly Ala Gly
245 250 255

Asn Asn Thr Leu Thr Cys Pro Thr Asp Cys Phe Arg Lys His Pro Glu
260 265 270

Ala Thr Tyr Ala Arg Cys Gly Ser Gly Pro Trp Leu Thr Pro Arg Cys
275 280 285

Met Val His Tyr Pro Tyr Arg Leu Trp His Tyr Pro Cys Thr Val Asn
290 295 300

Phe Thr Ile Phe Lys Val Arg Met Tyr Val Gly Gly Val Glu His Arg
305 310 315 320

Phe Glu Ala Ala Cys Asn Trp Thr Arg Gly Glu Arg Cys Asp Leu Glu
325 330 335

Asp Arg Asp Arg Ser Glu Leu Ser Pro Leu Leu Leu Ser Thr Thr Glu
340 345 350

Trp Gln Ile Leu Pro Cys Ser Phe Thr Thr Leu Pro Ala Leu Ser Thr
355 360 365

Gly Leu Ile His Leu His Gln Asn Ile Val Asp Val Gln Tyr Leu Tyr
370 375 380

Gly Val Gly Ser Ala Val Val Ser Leu Val Ile Lys Trp Glu Tyr Val
385 390 395 400

Leu Leu Leu Phe Leu Leu Leu Ala Asp Ala Arg Ile Cys Ala Cys Leu
405 410 415

Trp Met Met Leu Leu Ile Ala Gln Ala Glu Ala Ala Leu Glu Asn Leu
420 425 430

Val Val Leu Asn Ala Ala Ala Val Ala Gly Ala His Gly Thr Leu Ser
435 440 445

Phe Leu Val Phe Phe Cys Ala Ala Trp Tyr Ile Lys Gly Arg Leu Val
450 455 460

Pro Gly Ala Ala Tyr Ala Phe Tyr Gly Val Trp Pro Leu Leu Leu Leu
465 470 475 480

Leu Leu Ala Leu Pro Pro Arg Ala Tyr Ala 
485 490 



(2) INFORMATION FOR SEQ ID NO: 37:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1021 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iii) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 2..1018

(ix) FEATURE:

(A) NAME/KEY: mat_peptide

(B) LOCATION: 2..1015

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:



° ss s s; s sij si? 2? s s? s « - « ™ - 

SI? 2f S Sg 2S S £ ?» e s SI? 35 ES 5? EE 
ST S SI? SI? S Si SI £1 s I s ™ - ™ « « 

J = 40 45 

vl? Jer G?y ^ S £5 ™= ^ ACC err gtg tcc ctc 

50 Sex- Asp Thr Arg Gly Leu Val Ser Leu 

55 60 

«E Ser Pro £J Al a Gin ^ ^ ^° ^ GTA ^ C ACC 

fro oiy se. Ala Gin Lys He Gin Leu Val Asn Thr Asn Gly 

70 75 

Ser S lie £n Arq TnT a?° ^ TGC GAC TCC CAA 

80 *ff Thr Ala Leu ^ sn Cys Asn Asp Ser Leu Gin 

65 9° 95 

£ SS5J5SS s ssfs^^asijjs; is s m 

luu 105 """ 

SS 5? S SS S S S2 SI S KE EjSJ g 2 SI 
S i S? S 2S S S? S SK S-JK K S 2J SS 
25 £§ £ SJ SS SS ^ S SS S S SI S SI 5If S 

s ss as si? gs sj ss si? ^ sz is s s ss s si 



46 



94 



142 



190 



238 



28S 



3 34 



3 32 



4 30 



478 



526 
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GTG GTG GGG 
Val Val Gly 



GCG AAC GAC 
Ala Asn Asp 



GGC AAC TGG 
Gly Asn Trp 
210 

ACG TGT GGG 
Thr Cys Gly 
225 

TTG ACC TGC 
Leu Thr Cys 
240 

GCC AGA TGC 
Ala Arg Cys 



ACG ACC GAT 
Thr Thr Asp 
ISO 

TCG GAT GTG 
Ser Asp Val 
195 

TTC GGC TGT 
Phe Gly Cys 



CGG TTT GGT 
Arg Phe Gly 



CTG 
Leu 



ACA 
Thr 



GGC CCC CCG 
Gly Pro Pro 



TAC CCA TAT 
Tyr Pro Tyr 



CCC ACT GAC 
Pro Thr Asp 
245 

GGT TCT GGG 
Gly Ser Gly 
260 

AGG CTC TGG 
Arg Leu Trp 
275 



TGC 
Cys 
230 

TGT 
Cys 



ATT CTC 
He Leu 
200 

TGG ATG 
Trp Met 
215 

AAC ATC 
Asn He 



GTC CCC ACG 
Val Pro Thr 
185 

AAC AAC ACG 
Asn Asn Thr 



TAT AAC 
Tyr Asn 



AAT GGC ACT 
Asn Gly Thr 



GGG 
Glv 



TTT CGG 
Phe Arg 



CCC TGG CTG 
Pro Trp Leu 



CAC 
His 



TTC 
Phe 



GCA 
Ala 



. AGA 
Arg 
320 

GGC 
Gly 



AAG GTT 
Lys Val 
290 

TGC AAT 
Cys Asn 
305 

TCA GAG 
Ser Glu 



AGG ATG TAC 
Arg Met Tyr 

TGG ACT CGA 
Trp Thr Arg 



CTT AGC CCG 
Leu Ser Pro 
32 5 



GTG 
Val 



GGA 
Gly 
310 

CTG 
Leu 



TAC CCC 
Tyr Pro 
2 80 

GGG GGC 
Glv Gly 
295 

GAG CGT 
Glu Arg 



AAG 
Lys 



ACA 
Thr 
2S5 

TGC 
Cys 



GGG GCC 
Gly Ala 
235 

CAC CCC 
His Pro 
250 

CCT AGG 
Pro Arg 



CGG CCG 
Arg Pro 
205 

GGG TTC 
Gly Phe 
220 

GGC AAC 
Gly Asn 



TGG 
Trp 
190 

CCG 
Pro 



ACC 
Thr 



GGG 

Gly 



CGA 
Arg 



AAG 
Lys 



AAC 
Asn 



GAG GCC ACC 
Glu Ala Thr 



TGT ATG 
Cys Met 



ACT GTC 
Thr Val 



GTG GAG CAC 
Val Glu Kis 



CTG CTG 
Leu Leu 



TGT GAC TTG 
Cys Asp Leu 
315 

TCT ACA ACA 
Ser Thr Thr 
330 



AAC TTC 
Asn Phe 
2 85 

AGG TTC 

Arg Phe 
3 00 

GAG GAC 
Glu Asd 



GTT 
Val 
270 

ACC 
Thr 



ACC 
Thr 



TAC 
Tyr 
255 

CAT 



ATC 
He 



GAA 
Glu 



AGG 
Arg 



GAG TGG CAG 
Glu Trb Gin 



GCC 
Ala 



GAT 
Aso 



AGT 
Ser 
335 



AGA GCT 
Arg Ala 



TAATTA 



574 



622 



670 



718 



766 



814 



862 



510 



95J 



1006 



1021 



(2) INFORMATION FOR SEQ ID NO: 38:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 338 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:

Ile Pro Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly Val
1 5 10 15

Leu Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val
20 25 30

Leu Val Val Met Leu Leu Phe Ala Gly Val Asp Gly His Thr Arg Val
35 40 45

Ser Gly Gly Ala Ala Ala Ser Asp Thr Arg Gly Leu Val Ser Leu Phe
50 55 60

Ser Pro Gly Ser Ala Gln Lys Ile Gln Leu Val Asn Thr Asn Gly Ser
65 70 75 80

Trp His Ile Asn Arg Thr Ala Leu Asn Cys Asn Asp Ser Leu Gln Thr
85 90 95

Gly Phe Phe Ala Ala Leu Phe Tyr Lys His Lys Phe Asn Ser Ser Gly
100 105 110

Cys Pro Glu Arg Leu Ala Ser Cys Arg Ser Ile Asp Lys Phe Ala Gln
115 120 125

Gly Trp Gly Pro Leu Thr Tyr Thr Glu Pro Asn Ser Ser Asp Gln Arg
130 135 140

Pro Tyr Cys Trp His Tyr Ala Pro Arg Pro Cys Gly Ile Val Pro Ala
145 150 155 160

Ser Gln Val Cys Gly Pro Val Tyr Cys Phe Thr Pro Ser Pro Val Val
165 170 175

Val Gly Thr Thr Asp Arg Phe Gly Val Pro Thr Tyr Asn Trp Gly Ala
180 185 190

Asn Asp Ser Asp Val Leu Ile Leu Asn Asn Thr Arg Pro Pro Arg Gly
195 200 205

Asn Trp Phe Gly Cys Thr Trp Met Asn Gly Thr Gly Phe Thr Lys Thr
210 215 220

Cys Gly Gly Pro Pro Cys Asn Ile Gly Gly Ala Gly Asn Asn Thr Leu
225 230 235 240

Thr Cys Pro Thr Asp Cys Phe Arg Lys His Pro Glu Ala Thr Tyr Ala
245 250 255

Arg Cys Gly Ser Gly Pro Trp Leu Thr Pro Arg Cys Met Val His Tyr
260 265 270

Pro Tyr Arg Leu Trp His Tyr Pro Cys Thr Val Asn Phe Thr Ile Phe
275 280 285

Lys Val Arg Met Tyr Val Gly Gly Val Glu His Arg Phe Glu Ala Ala
290 295 300

Cys Asn Trp Thr Arg Gly Glu Arg Cys Asp Leu Glu Asp Arg Asp Arg
305 310 315 320

Ser Glu Leu Ser Pro Leu Leu Leu Ser Thr Thr Glu Trp Gln Ser Gly
325 330 335

Arg Ala 



(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1034 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 2..1032

(ix) FEATURE:

(A) NAME/KEY: mat_peptide

(B) LOCATION: 2..1029



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 



° S S Si S 55 SI? S» *» ™ s S? S SS £ S£ 
SI? S S SS 2S S S? £ E SZ S! o G S ^ S SIS 
°" ™ - S! £f s 2 S S £ S5 25 S? SI JS S 

GTG TCA GGA GGG GCA GCA GCC TCC GAT ACC AGG GGC CTT GTG TCC CTC 190 
Val Ser Gly Gly Ala Ala Ala Ser Asp Thr Arg Gly Leu Val Ser Leu 
50 55 60 

TTT AGC CCC GGG TCG GCT CAG AAA ATC CAG CTC GTA AAC ACC AAC GGC 238 
Phe Ser Pro Gly Ser Ala Gln Lys Ile Gln Leu Val Asn Thr Asn Gly 
65 70 75 

AGT TGG CAC ATC AAC AGG ACT GCC CTG AAC TGC AAC GAC TCC CTC CAA 286 
Ser Trp His Ile Asn Arg Thr Ala Leu Asn Cys Asn Asp Ser Leu Gln 
80 85 90 95 

ACA GGG TTC TTT GCC GCA CTA TTC TAC AAA CAC AAA TTC AAC TCG TCT 334 
Thr Gly Phe Phe Ala Ala Leu Phe Tyr Lys His Lys Phe Asn Ser Ser 
100 105 110 

GGA TGC CCA GAG CGC TTG GCC AGC TGT CGC TCC ATC GAC AAG TTC GCT 382 
Gly Cys Pro Glu Arg Leu Ala Ser Cys Arg Ser Ile Asp Lys Phe Ala 
115 120 125 

CAG GGG TGG GGT CCC CTC ACT TAC ACT GAG CCT AAC AGC TCG GAC CAG 430 
Gln Gly Trp Gly Pro Leu Thr Tyr Thr Glu Pro Asn Ser Ser Asp Gln 
130 135 140 

AGG CCC TAC TGC TGG CAC TAC GCG CCT CGA CCG TGT GGT ATT GTA CCC 478 
Arg Pro Tyr Cys Trp His Tyr Ala Pro Arg Pro Cys Gly Ile Val Pro 
145 150 155 

GCG TCT CAG GTG TGC GGT CCA GTG TAT TGC TTC ACC CCG AGC CCT GTT 526 
Ala Ser Gln Val Cys Gly Pro Val Tyr Cys Phe Thr Pro Ser Pro Val 
160 165 170 175 

GTG GTG GGG ACG ACC GAT CGG TTT GGT GTC CCC ACG TAT AAC TGG GGG 574 
Val Val Gly Thr Thr Asp Arg Phe Gly Val Pro Thr Tyr Asn Trp Gly 
180 185 190 

GCG AAC GAC TCG GAT GTG CTG ATT CTC AAC AAC ACG CGG CCG CCG CGA 622 
Ala Asn Asp Ser Asp Val Leu Ile Leu Asn Asn Thr Arg Pro Pro Arg 
195 200 205 

GGC AAC TGG TTC GGC TGT ACA TGG ATG AAT GGC ACT GGG TTC ACC AAG 670 
Gly Asn Trp Phe Gly Cys Thr Trp Met Asn Gly Thr Gly Phe Thr Lys 
210 215 220 

ACG TGT GGG GGC CCC CCG TGC AAC ATC GGG GGG GCC GGC AAC AAC ACC 718 
Thr Cys Gly Gly Pro Pro Cys Asn Ile Gly Gly Ala Gly Asn Asn Thr 
225 230 235 

TTG ACC TGC CCC ACT GAC TGT TTT CGG AAG CAC CCC GAG GCC ACC TAC 766 
Leu Thr Cys Pro Thr Asp Cys Phe Arg Lys His Pro Glu Ala Thr Tyr 
240 245 250 255 

GCC AGA TGC GGT TCT GGG CCC TGG CTG ACA CCT AGG TGT ATG GTT CAT 814 
Ala Arg Cys Gly Ser Gly Pro Trp Leu Thr Pro Arg Cys Met Val His 
260 265 270 

TAC CCA TAT AGG CTC TGG CAC TAC CCC TGC ACT GTC AAC TTC ACC ATC 862 
Tyr Pro Tyr Arg Leu Trp His Tyr Pro Cys Thr Val Asn Phe Thr Ile 
275 280 285 

TTC AAG GTT AGG ATG TAC GTG GGG GGC GTG GAG CAC AGG TTC GAA GCC 910 
Phe Lys Val Arg Met Tyr Val Gly Gly Val Glu His Arg Phe Glu Ala 
290 295 300 

GCA TGC AAT TGG ACT CGA GGA GAG CGT TGT GAC TTG GAG GAC AGG GAT 958 
Ala Cys Asn Trp Thr Arg Gly Glu Arg Cys Asp Leu Glu Asp Arg Asp 
305 310 315 

AGA TCA GAG CTT AGC CCG CTG CTG CTG TCT ACA ACA GGT GAT CGA GGG 1006 
Arg Ser Glu Leu Ser Pro Leu Leu Leu Ser Thr Thr Gly Asp Arg Gly 
320 325 330 335 

CAG ACA CCA TCA CCA CCA TCA CTA A TAG 1034 
Gln Thr Pro Ser Pro Pro Ser Leu 
340 



(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 343 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 

Ile Pro Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly Val 
1 5 10 15 

Leu Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val 
20 25 30 

Leu Val Val Met Leu Leu Phe Ala Gly Val Asp Gly His Thr Arg Val 
35 40 45 

Ser Gly Gly Ala Ala Ala Ser Asp Thr Arg Gly Leu Val Ser Leu Phe 
50 55 60 

Ser Pro Gly Ser Ala Gln Lys Ile Gln Leu Val Asn Thr Asn Gly Ser 
65 70 75 80 

Trp His Ile Asn Arg Thr Ala Leu Asn Cys Asn Asp Ser Leu Gln Thr 
85 90 95 

Gly Phe Phe Ala Ala Leu Phe Tyr Lys His Lys Phe Asn Ser Ser Gly 
100 105 110 

Cys Pro Glu Arg Leu Ala Ser Cys Arg Ser Ile Asp Lys Phe Ala Gln 
115 120 125 

Gly Trp Gly Pro Leu Thr Tyr Thr Glu Pro Asn Ser Ser Asp Gln Arg 
130 135 140 

Pro Tyr Cys Trp His Tyr Ala Pro Arg Pro Cys Gly Ile Val Pro Ala 
145 150 155 160 

Ser Gln Val Cys Gly Pro Val Tyr Cys Phe Thr Pro Ser Pro Val Val 
165 170 175 

Val Gly Thr Thr Asp Arg Phe Gly Val Pro Thr Tyr Asn Trp Gly Ala 
180 185 190 

Asn Asp Ser Asp Val Leu Ile Leu Asn Asn Thr Arg Pro Pro Arg Gly 
195 200 205 

Asn Trp Phe Gly Cys Thr Trp Met Asn Gly Thr Gly Phe Thr Lys Thr 
210 215 220 

Cys Gly Gly Pro Pro Cys Asn Ile Gly Gly Ala Gly Asn Asn Thr Leu 
225 230 235 240 

Thr Cys Pro Thr Asp Cys Phe Arg Lys His Pro Glu Ala Thr Tyr Ala 
245 250 255 

Arg Cys Gly Ser Gly Pro Trp Leu Thr Pro Arg Cys Met Val His Tyr 
260 265 270 

Pro Tyr Arg Leu Trp His Tyr Pro Cys Thr Val Asn Phe Thr Ile Phe 
275 280 285 

Lys Val Arg Met Tyr Val Gly Gly Val Glu His Arg Phe Glu Ala Ala 
290 295 300 

Cys Asn Trp Thr Arg Gly Glu Arg Cys Asp Leu Glu Asp Arg Asp Arg 
305 310 315 320 

Ser Glu Leu Ser Pro Leu Leu Leu Ser Thr Thr Gly Asp Arg Gly Gln 
325 330 335 

Thr Pro Ser Pro Pro Ser Leu 
340 



(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 945 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..942 

(ix) FEATURE: 

(A) NAME/KEY: mat_peptide 

(B) LOCATION: 1..939 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 



ATG GTG GGG AAC TGG GCT AAG GTT TTG GTT GTG ATG CTA CTC TTT GCC 
Met Val Gly Asn Trp Ala Lys Val Leu Val Val Met Leu Leu Phe Ala 
1 5 10 15 

GGC GTC GAC GGG CAT ACC CGC GTG TCA GGA GGG GCA GCA GCC TCC GAT 
Gly Val Asp Gly His Thr Arg Val Ser Gly Gly Ala Ala Ala Ser Asp 
20 25 30 

ACC AGG GGC CTT GTG TCC CTC TTT AGC CCC GGG TCG GCT CAG AAA ATC 144 
Thr Arg Gly Leu Val Ser Leu Phe Ser Pro Gly Ser Ala Gln Lys Ile 
35 40 45 

CAG CTC GTA AAC ACC AAC GGC AGT TGG CAC ATC AAC AGG ACT GCC CTG 192 
Gln Leu Val Asn Thr Asn Gly Ser Trp His Ile Asn Arg Thr Ala Leu 
50 55 60 

AAC TGC AAC GAC TCC CTC CAA ACA GGG TTC TTT GCC GCA CTA TTC TAC 240 
Asn Cys Asn Asp Ser Leu Gln Thr Gly Phe Phe Ala Ala Leu Phe Tyr 
65 70 75 80 

AAA CAC AAA TTC AAC TCG TCT GGA TGC CCA GAG CGC TTG GCC AGC TGT 288 
Lys His Lys Phe Asn Ser Ser Gly Cys Pro Glu Arg Leu Ala Ser Cys 
85 90 95 

CGC TCC ATC GAC AAG TTC GCT CAG GGG TGG GGT CCC CTC ACT TAC ACT 336 
Arg Ser Ile Asp Lys Phe Ala Gln Gly Trp Gly Pro Leu Thr Tyr Thr 
100 105 110 

GAG CCT AAC AGC TCG GAC CAG AGG CCC TAC TGC TGG CAC TAC GCG CCT 384 
Glu Pro Asn Ser Ser Asp Gln Arg Pro Tyr Cys Trp His Tyr Ala Pro 
115 120 125 

CGA CCG TGT GGT ATT GTA CCC GCG TCT CAG GTG TGC GGT CCA GTG TAT 432 
Arg Pro Cys Gly Ile Val Pro Ala Ser Gln Val Cys Gly Pro Val Tyr 
130 135 140 

TGC TTC ACC CCG AGC CCT GTT GTG GTG GGG ACG ACC GAT CGG TTT GGT 480 
Cys Phe Thr Pro Ser Pro Val Val Val Gly Thr Thr Asp Arg Phe Gly 
145 150 155 160 

GTC CCC ACG TAT AAC TGG GGG GCG AAC GAC TCG GAT GTG CTG ATT CTC 528 
Val Pro Thr Tyr Asn Trp Gly Ala Asn Asp Ser Asp Val Leu Ile Leu 
165 170 175 

AAC AAC ACG CGG CCG CCG CGA GGC AAC TGG TTC GGC TGT ACA TGG ATG 576 
Asn Asn Thr Arg Pro Pro Arg Gly Asn Trp Phe Gly Cys Thr Trp Met 
180 185 190 

AAT GGC ACT GGG TTC ACC AAG ACG TGT GGG GGC CCC CCG TGC AAC ATC 624 
Asn Gly Thr Gly Phe Thr Lys Thr Cys Gly Gly Pro Pro Cys Asn Ile 
195 200 205 

GGG GGG GCC GGC AAC AAC ACC TTG ACC TGC CCC ACT GAC TGT TTT CGG 672 
Gly Gly Ala Gly Asn Asn Thr Leu Thr Cys Pro Thr Asp Cys Phe Arg 
210 215 220 

AAG CAC CCC GAG GCC ACC TAC GCC AGA TGC GGT TCT GGG CCC TGG CTG 720 
Lys His Pro Glu Ala Thr Tyr Ala Arg Cys Gly Ser Gly Pro Trp Leu 
225 230 235 240 

ACA CCT AGG TGT ATG GTT CAT TAC CCA TAT AGG CTC TGG CAC TAC CCC 768 
Thr Pro Arg Cys Met Val His Tyr Pro Tyr Arg Leu Trp His Tyr Pro 
245 250 255 

TGC ACT GTC AAC TTC ACC ATC TTC AAG GTT AGG ATG TAC GTG GGG GGC 816 
Cys Thr Val Asn Phe Thr Ile Phe Lys Val Arg Met Tyr Val Gly Gly 
260 265 270 

GTG GAG CAC AGG TTC GAA GCC GCA TGC AAT TGG ACT CGA GGA GAG CGT 864 
Val Glu His Arg Phe Glu Ala Ala Cys Asn Trp Thr Arg Gly Glu Arg 
275 280 285 

TGT GAC TTG GAG GAC AGG GAT AGA TCA GAG CTT AGC CCG CTG CTG CTG 912 
Cys Asp Leu Glu Asp Arg Asp Arg Ser Glu Leu Ser Pro Leu Leu Leu 
290 295 300 

TCT ACA ACA GAG TGG CAG AGC TTA ATT AAT TAG 
Ser Thr Thr Glu Trp Gln Ser Leu Ile Asn 
305 310 

(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 314 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 

Met Val Gly Asn Trp Ala Lys Val Leu Val Val Met Leu Leu Phe Ala 
1 5 10 15 

Gly Val Asp Gly His Thr Arg Val Ser Gly Gly Ala Ala Ala Ser Asp 
20 25 30 

Thr Arg Gly Leu Val Ser Leu Phe Ser Pro Gly Ser Ala Gln Lys Ile 
35 40 45 

Gln Leu Val Asn Thr Asn Gly Ser Trp His Ile Asn Arg Thr Ala Leu 
50 55 60 

Asn Cys Asn Asp Ser Leu Gln Thr Gly Phe Phe Ala Ala Leu Phe Tyr 
65 70 75 80 

Lys His Lys Phe Asn Ser Ser Gly Cys Pro Glu Arg Leu Ala Ser Cys 
85 90 95 

Arg Ser Ile Asp Lys Phe Ala Gln Gly Trp Gly Pro Leu Thr Tyr Thr 
100 105 110 

Glu Pro Asn Ser Ser Asp Gln Arg Pro Tyr Cys Trp His Tyr Ala Pro 
115 120 125 

Arg Pro Cys Gly Ile Val Pro Ala Ser Gln Val Cys Gly Pro Val Tyr 
130 135 140 

Cys Phe Thr Pro Ser Pro Val Val Val Gly Thr Thr Asp Arg Phe Gly 
145 150 155 160 

Val Pro Thr Tyr Asn Trp Gly Ala Asn Asp Ser Asp Val Leu Ile Leu 
165 170 175 

Asn Asn Thr Arg Pro Pro Arg Gly Asn Trp Phe Gly Cys Thr Trp Met 
180 185 190 

Asn Gly Thr Gly Phe Thr Lys Thr Cys Gly Gly Pro Pro Cys Asn Ile 
195 200 205 

Gly Gly Ala Gly Asn Asn Thr Leu Thr Cys Pro Thr Asp Cys Phe Arg 
210 215 220 

Lys His Pro Glu Ala Thr Tyr Ala Arg Cys Gly Ser Gly Pro Trp Leu 
225 230 235 240 

Thr Pro Arg Cys Met Val His Tyr Pro Tyr Arg Leu Trp His Tyr Pro 
245 250 255 

Cys Thr Val Asn Phe Thr Ile Phe Lys Val Arg Met Tyr Val Gly Gly 
260 265 270 

Val Glu His Arg Phe Glu Ala Ala Cys Asn Trp Thr Arg Gly Glu Arg 
275 280 285 

Cys Asp Leu Glu Asp Arg Asp Arg Ser Glu Leu Ser Pro Leu Leu Leu 
290 295 300 

Ser Thr Thr Glu Trp Gln Ser Leu Ile Asn 
305 310 

(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 961 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..958 

(ix) FEATURE: 

(A) NAME/KEY: mat_peptide 

(B) LOCATION: 1..955 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 

ATG GTG GGG AAC TGG GCT AAG GTT TTG GTT GTG ATG CTA CTC TTT GCC 
Met Val Gly Asn Trp Ala Lys Val Leu Val Val Met Leu Leu Phe Ala 
1 5 10 15 

GGC GTC GAC GGG CAT ACC CGC GTG TCA GGA GGG GCA GCA GCC TCC GAT 
Gly Val Asp Gly His Thr Arg Val Ser Gly Gly Ala Ala Ala Ser Asp 
20 25 30 

ACC AGG GGC CTT GTG TCC CTC TTT AGC CCC GGG TCG GCT CAG AAA ATC 
Thr Arg Gly Leu Val Ser Leu Phe Ser Pro Gly Ser Ala Gln Lys Ile 
35 40 45 

CAG CTC GTA AAC ACC AAC GGC AGT TGG CAC ATC AAC AGG ACT GCC CTG 
Gln Leu Val Asn Thr Asn Gly Ser Trp His Ile Asn Arg Thr Ala Leu 
50 55 60 

AAC TGC AAC GAC TCC CTC CAA ACA GGG TTC TTT GCC GCA CTA TTC TAC 
Asn Cys Asn Asp Ser Leu Gln Thr Gly Phe Phe Ala Ala Leu Phe Tyr 
65 70 75 80 

AAA CAC AAA TTC AAC TCG TCT GGA TGC CCA GAG CGC TTG GCC AGC TGT 
Lys His Lys Phe Asn Ser Ser Gly Cys Pro Glu Arg Leu Ala Ser Cys 
85 90 95 

CGC TCC ATC GAC AAG TTC GCT CAG GGG TGG GGT CCC CTC ACT TAC ACT 
Arg Ser Ile Asp Lys Phe Ala Gln Gly Trp Gly Pro Leu Thr Tyr Thr 
100 105 110 

GAG CCT AAC AGC TCG GAC CAG AGG CCC TAC TGC TGG CAC TAC GCG CCT 
Glu Pro Asn Ser Ser Asp Gln Arg Pro Tyr Cys Trp His Tyr Ala Pro 
115 120 125 

CGA CCG TGT GGT ATT GTA CCC GCG TCT CAG GTG TGC GGT CCA GTG TAT 
Arg Pro Cys Gly Ile Val Pro Ala Ser Gln Val Cys Gly Pro Val Tyr 
130 135 140 

TGC TTC ACC CCG AGC CCT GTT GTG GTG GGG ACG ACC GAT CGG TTT GGT 
Cys Phe Thr Pro Ser Pro Val Val Val Gly Thr Thr Asp Arg Phe Gly 
145 150 155 160 

GTC CCC ACG TAT AAC TGG GGG GCG AAC GAC TCG GAT GTG CTG ATT CTC 528 
Val Pro Thr Tyr Asn Trp Gly Ala Asn Asp Ser Asp Val Leu Ile Leu 
165 170 175 

AAC AAC ACG CGG CCG CCG CGA GGC AAC TGG TTC GGC TGT ACA TGG ATG 576 
Asn Asn Thr Arg Pro Pro Arg Gly Asn Trp Phe Gly Cys Thr Trp Met 
180 185 190 

AAT GGC ACT GGG TTC ACC AAG ACG TGT GGG GGC CCC CCG TGC AAC ATC 624 
Asn Gly Thr Gly Phe Thr Lys Thr Cys Gly Gly Pro Pro Cys Asn Ile 
195 200 205 

GGG GGG GCC GGC AAC AAC ACC TTG ACC TGC CCC ACT GAC TGT TTT CGG 672 
Gly Gly Ala Gly Asn Asn Thr Leu Thr Cys Pro Thr Asp Cys Phe Arg 
210 215 220 

AAG CAC CCC GAG GCC ACC TAC GCC AGA TGC GGT TCT GGG CCC TGG CTG 720 
Lys His Pro Glu Ala Thr Tyr Ala Arg Cys Gly Ser Gly Pro Trp Leu 
225 230 235 240 

ACA CCT AGG TGT ATG GTT CAT TAC CCA TAT AGG CTC TGG CAC TAC CCC 768 
Thr Pro Arg Cys Met Val His Tyr Pro Tyr Arg Leu Trp His Tyr Pro 
245 250 255 

TGC ACT GTC AAC TTC ACC ATC TTC AAG GTT AGG ATG TAC GTG GGG GGC 816 
Cys Thr Val Asn Phe Thr Ile Phe Lys Val Arg Met Tyr Val Gly Gly 
260 265 270 

GTG GAG CAC AGG TTC GAA GCC GCA TGC AAT TGG ACT CGA GGA GAG CGT 864 
Val Glu His Arg Phe Glu Ala Ala Cys Asn Trp Thr Arg Gly Glu Arg 
275 280 285 

TGT GAC TTG GAG GAC AGG GAT AGA TCA GAG CTT AGC CCG CTG CTG CTG 912 
Cys Asp Leu Glu Asp Arg Asp Arg Ser Glu Leu Ser Pro Leu Leu Leu 
290 295 300 

TCT ACA ACA GGT GAT CGA GGG CAG ACA CCA TCA CCA CCA TCA CTA A 958 
Ser Thr Thr Gly Asp Arg Gly Gln Thr Pro Ser Pro Pro Ser Leu 
305 310 315 

TAG 961 

(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 319 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 

Met Val Gly Asn Trp Ala Lys Val Leu Val Val Met Leu Leu Phe Ala 
1 5 10 15 

Gly Val Asp Gly His Thr Arg Val Ser Gly Gly Ala Ala Ala Ser Asp 
20 25 30 

Thr Arg Gly Leu Val Ser Leu Phe Ser Pro Gly Ser Ala Gln Lys Ile 
35 40 45 

Gln Leu Val Asn Thr Asn Gly Ser Trp His Ile Asn Arg Thr Ala Leu 
50 55 60 

Asn Cys Asn Asp Ser Leu Gln Thr Gly Phe Phe Ala Ala Leu Phe Tyr 
65 70 75 80 

Lys His Lys Phe Asn Ser Ser Gly Cys Pro Glu Arg Leu Ala Ser Cys 
85 90 95 

Arg Ser Ile Asp Lys Phe Ala Gln Gly Trp Gly Pro Leu Thr Tyr Thr 
100 105 110 

Glu Pro Asn Ser Ser Asp Gln Arg Pro Tyr Cys Trp His Tyr Ala Pro 
115 120 125 

Arg Pro Cys Gly Ile Val Pro Ala Ser Gln Val Cys Gly Pro Val Tyr 
130 135 140 

Cys Phe Thr Pro Ser Pro Val Val Val Gly Thr Thr Asp Arg Phe Gly 
145 150 155 160 

Val Pro Thr Tyr Asn Trp Gly Ala Asn Asp Ser Asp Val Leu Ile Leu 
165 170 175 

Asn Asn Thr Arg Pro Pro Arg Gly Asn Trp Phe Gly Cys Thr Trp Met 
180 185 190 

Asn Gly Thr Gly Phe Thr Lys Thr Cys Gly Gly Pro Pro Cys Asn Ile 
195 200 205 

Gly Gly Ala Gly Asn Asn Thr Leu Thr Cys Pro Thr Asp Cys Phe Arg 
210 215 220 

Lys His Pro Glu Ala Thr Tyr Ala Arg Cys Gly Ser Gly Pro Trp Leu 
225 230 235 240 

Thr Pro Arg Cys Met Val His Tyr Pro Tyr Arg Leu Trp His Tyr Pro 
245 250 255 

Cys Thr Val Asn Phe Thr Ile Phe Lys Val Arg Met Tyr Val Gly Gly 
260 265 270 

Val Glu His Arg Phe Glu Ala Ala Cys Asn Trp Thr Arg Gly Glu Arg 
275 280 285 

Cys Asp Leu Glu Asp Arg Asp Arg Ser Glu Leu Ser Pro Leu Leu Leu 
290 295 300 

Ser Thr Thr Gly Asp Arg Gly Gln Thr Pro Ser Pro Pro Ser Leu 
305 310 315 
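
A nucleotide listing and its paired protein listing (for example SEQ ID NO: 43 and SEQ ID NO: 44) can be cross-checked by translating the codons with the standard genetic code. The following is a minimal sketch, not part of the original listing: the names `CODON_TABLE`, `THREE_LETTER` and `translate` are hypothetical, and the code assumes the standard (universal) genetic code and whitespace-separated codons as printed above. `Ter` is used here as an illustrative label for a stop codon.

```python
# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_1 = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_1[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# One-letter to three-letter residue codes, as used in the listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",  # stop codon (illustrative label)
}


def translate(codons: str) -> str:
    """Translate whitespace-separated DNA codons into three-letter residues."""
    return " ".join(THREE_LETTER[CODON_TABLE[c.upper()]] for c in codons.split())


# First sixteen codons printed for SEQ ID NO: 43.
first_row = "ATG GTG GGG AAC TGG GCT AAG GTT TTG GTT GTG ATG CTA CTC TTT GCC"
print(translate(first_row))
```

Comparing the printed string with the first row of SEQ ID NO: 44 is a quick way to spot transcription errors such as `Gin` for `Gln` or `He` for `Ile`.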

(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1395 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..1392 

(ix) FEATURE: 

(A) NAME/KEY: mat_peptide 
(B) LOCATION: 1..1389 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 



ATG GTG GCG GGG GCC CAT TGG GGA GTC CTG GCG GGC CTC GCC TAC TAT 48 
Met Val Ala Gly Ala His Trp Gly Val Leu Ala Gly Leu Ala Tyr Tyr 
1 5 10 15 

TCC ATG GTG GGG AAC TGG GCT AAG GTT TTG GTT GTG ATG CTA CTC TTT 96 
Ser Met Val Gly Asn Trp Ala Lys Val Leu Val Val Met Leu Leu Phe 
20 25 30 

GCC GGC GTC GAC GGG CAT ACC CGC GTG TCA GGA GGG GCA GCA GCC TCC 144 
Ala Gly Val Asp Gly His Thr Arg Val Ser Gly Gly Ala Ala Ala Ser 
35 40 45 

GAT ACC AGG GGC CTT GTG TCC CTC TTT AGC CCC GGG TCG GCT CAG AAA 192 
Asp Thr Arg Gly Leu Val Ser Leu Phe Ser Pro Gly Ser Ala Gln Lys 
50 55 60 

ATC CAG CTC GTA AAC ACC AAC GGC AGT TGG CAC ATC AAC AGG ACT GCC 240 
Ile Gln Leu Val Asn Thr Asn Gly Ser Trp His Ile Asn Arg Thr Ala 
65 70 75 80 

CTG AAC TGC AAC GAC TCC CTC CAA ACA GGG TTC TTT GCC GCA CTA TTC 288 
Leu Asn Cys Asn Asp Ser Leu Gln Thr Gly Phe Phe Ala Ala Leu Phe 
85 90 95 

TAC AAA CAC AAA TTC AAC TCG TCT GGA TGC CCA GAG CGC TTG GCC AGC 336 
Tyr Lys His Lys Phe Asn Ser Ser Gly Cys Pro Glu Arg Leu Ala Ser 
100 105 110 

TGT CGC TCC ATC GAC AAG TTC GCT CAG GGG TGG GGT CCC CTC ACT TAC 384 
Cys Arg Ser Ile Asp Lys Phe Ala Gln Gly Trp Gly Pro Leu Thr Tyr 
115 120 125 

ACT GAG CCT AAC AGC TCG GAC CAG AGG CCC TAC TGC TGG CAC TAC GCG 432 
Thr Glu Pro Asn Ser Ser Asp Gln Arg Pro Tyr Cys Trp His Tyr Ala 
130 135 140 

CCT CGA CCG TGT GGT ATT GTA CCC GCG TCT CAG GTG TGC GGT CCA GTG 480 
Pro Arg Pro Cys Gly Ile Val Pro Ala Ser Gln Val Cys Gly Pro Val 
145 150 155 160 

TAT TGC TTC ACC CCG AGC CCT GTT GTG GTG GGG ACG ACC GAT CGG TTT 528 
Tyr Cys Phe Thr Pro Ser Pro Val Val Val Gly Thr Thr Asp Arg Phe 
165 170 175 

GGT GTC CCC ACG TAT AAC TGG GGG GCG AAC GAC TCG GAT GTG CTG ATT 576 
Gly Val Pro Thr Tyr Asn Trp Gly Ala Asn Asp Ser Asp Val Leu Ile 
180 185 190 

CTC AAC AAC ACG CGG CCG CCG CGA GGC AAC TGG TTC GGC TGT ACA TGG 624 
Leu Asn Asn Thr Arg Pro Pro Arg Gly Asn Trp Phe Gly Cys Thr Trp 
195 200 205 

ATG AAT GGC ACT GGG TTC ACC AAG ACG TGT GGG GGC CCC CCG TGC AAC 672 
Met Asn Gly Thr Gly Phe Thr Lys Thr Cys Gly Gly Pro Pro Cys Asn 
210 215 220 

ATC GGG GGG GCC GGC AAC AAC ACC TTG ACC TGC CCC ACT GAC TGT TTT 720 
Ile Gly Gly Ala Gly Asn Asn Thr Leu Thr Cys Pro Thr Asp Cys Phe 
225 230 235 240 

CGG AAG CAC CCC GAG GCC ACC TAC GCC AGA TGC GGT TCT GGG CCC TGG 768 
Arg Lys His Pro Glu Ala Thr Tyr Ala Arg Cys Gly Ser Gly Pro Trp 
245 250 255 

CTG ACA CCT AGG TGT ATG GTT CAT TAC CCA TAT AGG CTC TGG CAC TAC 816 
Leu Thr Pro Arg Cys Met Val His Tyr Pro Tyr Arg Leu Trp His Tyr 
260 265 270 

CCC TGC ACT GTC AAC TTC ACC ATC TTC AAG GTT AGG ATG TAC GTG GGG 864 
Pro Cys Thr Val Asn Phe Thr Ile Phe Lys Val Arg Met Tyr Val Gly 
275 280 285 

GGC GTG GAG CAC AGG TTC GAA GCC GCA TGC AAT TGG ACT CGA GGA GAG 912 
Gly Val Glu His Arg Phe Glu Ala Ala Cys Asn Trp Thr Arg Gly Glu 
290 295 300 

CGT TGT GAC TTG GAG GAC AGG GAT AGA TCA GAG CTT AGC CCG CTG CTG 960 
Arg Cys Asp Leu Glu Asp Arg Asp Arg Ser Glu Leu Ser Pro Leu Leu 
305 310 315 320 

CTG TCT ACA ACA GAG TGG CAG ATA CTG CCC TGT TCC TTC ACC ACC CTG 1008 
Leu Ser Thr Thr Glu Trp Gln Ile Leu Pro Cys Ser Phe Thr Thr Leu 
325 330 335 

CCG GCC CTA TCC ACC GGC CTG ATC CAC CTC CAT CAG AAC ATC GTG GAC 1056 
Pro Ala Leu Ser Thr Gly Leu Ile His Leu His Gln Asn Ile Val Asp 
340 345 350 

GTG CAA TAC CTG TAC GGT GTA GGG TCG GCG GTT GTC TCC CTT GTC ATC 1104 
Val Gln Tyr Leu Tyr Gly Val Gly Ser Ala Val Val Ser Leu Val Ile 
355 360 365 

AAA TGG GAG TAT GTC CTG TTG CTC TTC CTT CTC CTG GCA GAC GCG CGC 1152 
Lys Trp Glu Tyr Val Leu Leu Leu Phe Leu Leu Leu Ala Asp Ala Arg 
370 375 380 

ATC TGC GCC TGC TTA TGG ATG ATG CTG CTG ATA GCT CAA GCT GAG GCC 1200 
Ile Cys Ala Cys Leu Trp Met Met Leu Leu Ile Ala Gln Ala Glu Ala 
385 390 395 400 

GCC TTA GAG AAC CTG GTG GTC CTC AAT GCG GCG GCC GTG GCC GGG GCG 1248 
Ala Leu Glu Asn Leu Val Val Leu Asn Ala Ala Ala Val Ala Gly Ala 
405 410 415 

CAT GGC ACT CTT TCC TTC CTT GTG TTC TTC TGT GCT GCC TGG TAC ATC 1296 
His Gly Thr Leu Ser Phe Leu Val Phe Phe Cys Ala Ala Trp Tyr Ile 
420 425 430 

AAG GGC AGG CTG GTC CCT GGT GCG GCA TAC GCC TTC TAT GGC GTG TGG 1344 
Lys Gly Arg Leu Val Pro Gly Ala Ala Tyr Ala Phe Tyr Gly Val Trp 
435 440 445 

CCG CTG CTC CTG CTT CTG CTG GCC TTA CCA CCA CGA GCT TAT GCC TAGTAA 1395 
Pro Leu Leu Leu Leu Leu Leu Ala Leu Pro Pro Arg Ala Tyr Ala 
450 455 460 

(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 463 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46: 

Met Val Ala Gly Ala His Trp Gly Val Leu Ala Gly Leu Ala Tyr Tyr 
1 5 10 15 

Ser Met Val Gly Asn Trp Ala Lys Val Leu Val Val Met Leu Leu Phe 
20 25 30 

Ala Gly Val Asp Gly His Thr Arg Val Ser Gly Gly Ala Ala Ala Ser 
35 40 45 

Asp Thr Arg Gly Leu Val Ser Leu Phe Ser Pro Gly Ser Ala Gln Lys 
50 55 60 

Ile Gln Leu Val Asn Thr Asn Gly Ser Trp His Ile Asn Arg Thr Ala 
65 70 75 80 

Leu Asn Cys Asn Asp Ser Leu Gln Thr Gly Phe Phe Ala Ala Leu Phe 
85 90 95 

Tyr Lys His Lys Phe Asn Ser Ser Gly Cys Pro Glu Arg Leu Ala Ser 
100 105 110 

Cys Arg Ser Ile Asp Lys Phe Ala Gln Gly Trp Gly Pro Leu Thr Tyr 
115 120 125 

Thr Glu Pro Asn Ser Ser Asp Gln Arg Pro Tyr Cys Trp His Tyr Ala 
130 135 140 

Pro Arg Pro Cys Gly Ile Val Pro Ala Ser Gln Val Cys Gly Pro Val 
145 150 155 160 

Tyr Cys Phe Thr Pro Ser Pro Val Val Val Gly Thr Thr Asp Arg Phe 
165 170 175 

Gly Val Pro Thr Tyr Asn Trp Gly Ala Asn Asp Ser Asp Val Leu Ile 
180 185 190 

Leu Asn Asn Thr Arg Pro Pro Arg Gly Asn Trp Phe Gly Cys Thr Trp 
195 200 205 

Met Asn Gly Thr Gly Phe Thr Lys Thr Cys Gly Gly Pro Pro Cys Asn 
210 215 220 

Ile Gly Gly Ala Gly Asn Asn Thr Leu Thr Cys Pro Thr Asp Cys Phe 
225 230 235 240 

Arg Lys His Pro Glu Ala Thr Tyr Ala Arg Cys Gly Ser Gly Pro Trp 
245 250 255 

Leu Thr Pro Arg Cys Met Val His Tyr Pro Tyr Arg Leu Trp His Tyr 
260 265 270 

Pro Cys Thr Val Asn Phe Thr Ile Phe Lys Val Arg Met Tyr Val Gly 
275 280 285 

Gly Val Glu His Arg Phe Glu Ala Ala Cys Asn Trp Thr Arg Gly Glu 
290 295 300 

Arg Cys Asp Leu Glu Asp Arg Asp Arg Ser Glu Leu Ser Pro Leu Leu 
305 310 315 320 

Leu Ser Thr Thr Glu Trp Gln Ile Leu Pro Cys Ser Phe Thr Thr Leu 
325 330 335 

Pro Ala Leu Ser Thr Gly Leu Ile His Leu His Gln Asn Ile Val Asp 
340 345 350 

Val Gln Tyr Leu Tyr Gly Val Gly Ser Ala Val Val Ser Leu Val Ile 
355 360 365 

Lys Trp Glu Tyr Val Leu Leu Leu Phe Leu Leu Leu Ala Asp Ala Arg 
370 375 380 

Ile Cys Ala Cys Leu Trp Met Met Leu Leu Ile Ala Gln Ala Glu Ala 
385 390 395 400 

Ala Leu Glu Asn Leu Val Val Leu Asn Ala Ala Ala Val Ala Gly Ala 
405 410 415 

His Gly Thr Leu Ser Phe Leu Val Phe Phe Cys Ala Ala Trp Tyr Ile 
420 425 430 

Lys Gly Arg Leu Val Pro Gly Ala Ala Tyr Ala Phe Tyr Gly Val Trp 
435 440 445 

Pro Leu Leu Leu Leu Leu Leu Ala Leu Pro Pro Arg Ala Tyr Ala 
450 455 460 
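
The stated lengths can be checked against the feature locations: a `mat_peptide` location of 1..1389 in SEQ ID NO: 45 spans 1389 nucleotides, which at three nucleotides per codon corresponds to the 463 amino acids given for SEQ ID NO: 46. The following is a minimal sketch of that arithmetic; the helper name `residues_from_location` is hypothetical, and it assumes a simple `start..end` location with no joins or partial-codon offsets.

```python
def residues_from_location(location: str) -> int:
    """Return the number of codons spanned by a 'start..end' location (1-based, inclusive)."""
    start_text, end_text = location.split("..")
    start, end = int(start_text), int(end_text)
    length = end - start + 1  # inclusive span in nucleotides
    if length % 3 != 0:
        raise ValueError(f"span of {length} nt is not a whole number of codons")
    return length // 3


# mat_peptide location listed for SEQ ID NO: 45
print(residues_from_location("1..1389"))
```

A location whose span is not a multiple of three raises an error, which is a useful flag for a mis-scanned digit in a LOCATION field.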

(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2082 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..2079 

(ix) FEATURE: 

(A) NAME/KEY: mat_peptide 

(B) LOCATION: 1..2076 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 

AAT TTG GGT AAG GTC ATC GAT ACC CTT ACA TGC GGC TTC GCC GAC CTC 
Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu 
1 5 10 15 

GTG GGG TAC ATT CCG CTC GTC GGC GCC CCC CTA GGG GGC GCT GCC AGG 
Val Gly Tyr Ile Pro Leu Val Gly Ala Pro Leu Gly Gly Ala Ala Arg 
20 25 30 

GCC CTG GCG CAT GGC GTC CGG GTT CTG GAG GAC GGC GTG AAC TAT GCA 
Ala Leu Ala His Gly Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala 
35 40 45 

ACA GGG AAT TTG CCC GGT TGC TCT TTC TCT ATC TTC CTC TTG GCT TTG 
Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu 
50 55 60 

CTG TCC TGT CTG ACC GTT CCA GCT TCC GCT TAT GAA GTG CGC AAC GTG 
Leu Ser Cys Leu Thr Val Pro Ala Ser Ala Tyr Glu Val Arg Asn Val 
65 70 75 80 

TCC GGG ATG TAC CAT GTC ACG AAC GAC TGC TCC AAC TCA AGC ATT GTG 
Ser Gly Met Tyr His Val Thr Asn Asp Cys Ser Asn Ser Ser Ile Val 
85 90 95 

TAT GAG GCA GCG GAC ATG ATC ATG CAC ACC CCC GGG TGC GTG CCC TGC 
Tyr Glu Ala Ala Asp Met Ile Met His Thr Pro Gly Cys Val Pro Cys 
100 105 110 

GTT CGG GAG AAC AAC TCT TCC CGC TGC TGG GTA GCG CTC ACC CCC ACG 
Val Arg Glu Asn Asn Ser Ser Arg Cys Trp Val Ala Leu Thr Pro Thr 
115 120 125 

CTC GCA GCT AGG AAC GCC AGC GTC CCC ACC ACG ACA ATA CGA CGC CAC 432 
Leu Ala Ala Arg Asn Ala Ser Val Pro Thr Thr Thr Ile Arg Arg His 
130 135 140 

GTC GAT TTG CTC GTT GGG GCG GCT GCT TTC TGT TCC GCT ATG TAC GTG 480 
Val Asp Leu Leu Val Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val 
145 150 155 160 

GGG GAC CTC TGC GGA TCT GTC TTC CTC GTC TCC CAG CTG TTC ACC ATC 528 
Gly Asp Leu Cys Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Ile 
165 170 175 

TCG CCT CGC CGG CAT GAG ACG GTG CAG GAC TGC AAT TGC TCA ATC TAT 576 
Ser Pro Arg Arg His Glu Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr 
180 185 190 

CCC GGC CAC ATA ACG GGT CAC CGT ATG GCT TGG GAT ATG ATG ATG AAC 624 
Pro Gly His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn 
195 200 205 

TGG TCG CCT ACA ACG GCC CTG GTG GTA TCG CAG CTG CTC CGG ATC CCA 672 
Trp Ser Pro Thr Thr Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro 
210 215 220 

CAA GCT GTC GTG GAC ATG GTG GCG GGG GCC CAT TGG GGA GTC CTG GCG 720 
Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly Val Leu Ala 
225 230 235 240 

GGC CTC GCC TAC TAT TCC ATG GTG GGG AAC TGG GCT AAG GTT TTG GTT 768 
Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val Leu Val 
245 250 255 

GTG ATG CTA CTC TTT GCC GGC GTC GAC GGG CAT ACC CGC GTG TCA GGA 816 
Val Met Leu Leu Phe Ala Gly Val Asp Gly His Thr Arg Val Ser Gly 
260 265 270 

GGG GCA GCA GCC TCC GAT ACC AGG GGC CTT GTG TCC CTC TTT AGC CCC 864 
Gly Ala Ala Ala Ser Asp Thr Arg Gly Leu Val Ser Leu Phe Ser Pro 
275 280 285 

GGG TCG GCT CAG AAA ATC CAG CTC GTA AAC ACC AAC GGC AGT TGG CAC 912 
Gly Ser Ala Gln Lys Ile Gln Leu Val Asn Thr Asn Gly Ser Trp His 
290 295 300 

ATC AAC AGG ACT GCC CTG AAC TGC AAC GAC TCC CTC CAA ACA GGG TTC 960 
Ile Asn Arg Thr Ala Leu Asn Cys Asn Asp Ser Leu Gln Thr Gly Phe 
305 310 315 320 

TTT GCC GCA CTA TTC TAC AAA CAC AAA TTC AAC TCG TCT GGA TGC CCA 1008 
Phe Ala Ala Leu Phe Tyr Lys His Lys Phe Asn Ser Ser Gly Cys Pro 
325 330 335 

GAG CGC TTG GCC AGC TGT CGC TCC ATC GAC AAG TTC GCT CAG GGG TGG 1056 
Glu Arg Leu Ala Ser Cys Arg Ser Ile Asp Lys Phe Ala Gln Gly Trp 
340 345 350 

GGT CCC CTC ACT TAC ACT GAG CCT AAC AGC TCG GAC CAG AGG CCC TAC 1104 
Gly Pro Leu Thr Tyr Thr Glu Pro Asn Ser Ser Asp Gln Arg Pro Tyr 
355 360 365 

TGC TGG CAC TAC GCG CCT CGA CCG TGT GGT ATT GTA CCC GCG TCT CAG 1152 
Cys Trp His Tyr Ala Pro Arg Pro Cys Gly Ile Val Pro Ala Ser Gln 
370 375 380 

GTG TGC GGT CCA GTG TAT TGC TTC ACC CCG AGC CCT GTT GTG GTG GGG 1200 
Val Cys Gly Pro Val Tyr Cys Phe Thr Pro Ser Pro Val Val Val Gly 
385 390 395 400 

ACG ACC GAT CGG TTT GGT GTC CCC ACG TAT AAC TGG GGG GCG AAC GAC 1248 
Thr Thr Asp Arg Phe Gly Val Pro Thr Tyr Asn Trp Gly Ala Asn Asp 
405 410 415 

TCG GAT GTG CTG ATT CTC AAC AAC ACG CGG CCG CCG CGA GGC AAC TGG 1296 
Ser Asp Val Leu Ile Leu Asn Asn Thr Arg Pro Pro Arg Gly Asn Trp 
420 425 430 

TTC GGC TGT ACA TGG ATG AAT GGC ACT GGG TTC ACC AAG ACG TGT GGG 1344 
Phe Gly Cys Thr Trp Met Asn Gly Thr Gly Phe Thr Lys Thr Cys Gly 
435 440 445 

GGC CCC CCG TGC AAC ATC GGG GGG GCC GGC AAC AAC ACC TTG ACC TGC 1392 
Gly Pro Pro Cys Asn Ile Gly Gly Ala Gly Asn Asn Thr Leu Thr Cys 
450 455 460 

CCC ACT GAC TGT TTT CGG AAG CAC CCC GAG GCC ACC TAC GCC AGA TGC 1440 
Pro Thr Asp Cys Phe Arg Lys His Pro Glu Ala Thr Tyr Ala Arg Cys 
465 470 475 480 

GGT TCT GGG CCC TGG CTG ACA CCT AGG TGT ATG GTT CAT TAC CCA TAT 1488 
Gly Ser Gly Pro Trp Leu Thr Pro Arg Cys Met Val His Tyr Pro Tyr 
485 490 495 

AGG CTC TGG CAC TAC CCC TGC ACT GTC AAC TTC ACC ATC TTC AAG GTT 1536 
Arg Leu Trp His Tyr Pro Cys Thr Val Asn Phe Thr Ile Phe Lys Val 
500 505 510 

AGG ATG TAC GTG GGG GGC GTG GAG CAC AGG TTC GAA GCC GCA TGC AAT 
Arg Met Tyr Val Gly Gly Val Glu His Arg Phe Glu Ala Ala Cys Asn 
515 520 525 

TGG ACT CGA GGA GAG CGT TGT GAC TTG GAG GAC AGG GAT AGA TCA GAG 1632 
Trp Thr Arg Gly Glu Arg Cys Asp Leu Glu Asp Arg Asp Arg Ser Glu 
530 535 540 

CTT AGC CCG CTG CTG CTG TCT ACA ACA GAG TGG CAG ATA CTG CCC TGT 1680 
Leu Ser Pro Leu Leu Leu Ser Thr Thr Glu Trp Gln Ile Leu Pro Cys 
545 550 555 560 

TCC TTC ACC ACC CTG CCG GCC CTA TCC ACC GGC CTG ATC CAC CTC CAT 1728 
Ser Phe Thr Thr Leu Pro Ala Leu Ser Thr Gly Leu Ile His Leu His 
565 570 575 

CAG AAC ATC GTG GAC GTG CAA TAC CTG TAC GGT GTA GGG TCG GCG GTT 1776 
Gln Asn Ile Val Asp Val Gln Tyr Leu Tyr Gly Val Gly Ser Ala Val 
580 585 590 

GTC TCC CTT GTC ATC AAA TGG GAG TAT GTC CTG TTG CTC TTC CTT CTC 1824 
Val Ser Leu Val Ile Lys Trp Glu Tyr Val Leu Leu Leu Phe Leu Leu 
595 600 605 

CTG GCA GAC GCG CGC ATC TGC GCC TGC TTA TGG ATG ATG CTG CTG ATA 1872 
Leu Ala Asp Ala Arg Ile Cys Ala Cys Leu Trp Met Met Leu Leu Ile 
610 615 620 

GCT CAA GCT GAG GCC GCC TTA GAG AAC CTG GTG GTC CTC AAT GCG GCG 1920 
Ala Gln Ala Glu Ala Ala Leu Glu Asn Leu Val Val Leu Asn Ala Ala 
625 630 635 640 

GCC GTG GCC GGG GCG CAT GGC ACT CTT TCC TTC CTT GTG TTC TTC TGT 1968 
Ala Val Ala Gly Ala His Gly Thr Leu Ser Phe Leu Val Phe Phe Cys 
645 650 655 

GCT GCC TGG TAC ATC AAG GGC AGG CTG GTC CCT GGT GCG GCA TAC GCC 2016 
Ala Ala Trp Tyr Ile Lys Gly Arg Leu Val Pro Gly Ala Ala Tyr Ala 
660 665 670 

TTC TAT GGC GTG TGG CCG CTG CTC CTG CTT CTG CTG GCC TTA CCA CCA 2064 
Phe Tyr Gly Val Trp Pro Leu Leu Leu Leu Leu Leu Ala Leu Pro Pro 
675 680 685 

CGA GCT TAT GCC TAGTAA 2082 
Arg Ala Tyr Ala 
690 

(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 692 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48: 

Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu
1 5 10 15

Val Gly Tyr Ile Pro Leu Val Gly Ala Pro Leu Gly Gly Ala Ala Arg
20 25 30

Ala Leu Ala His Gly Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala
35 40 45

Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu
50 55 60

Leu Ser Cys Leu Thr Val Pro Ala Ser Ala Tyr Glu Val Arg Asn Val
65 70 75 80

Ser Gly Met Tyr His Val Thr Asn Asp Cys Ser Asn Ser Ser Ile Val
85 90 95

Tyr Glu Ala Ala Asp Met Ile Met His Thr Pro Gly Cys Val Pro Cys
100 105 110

Val Arg Glu Asn Asn Ser Ser Arg Cys Trp Val Ala Leu Thr Pro Thr
115 120 125

Leu Ala Ala Arg Asn Ala Ser Val Pro Thr Thr Thr Ile Arg Arg His
130 135 140

Val Asp Leu Leu Val Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val
145 150 155 160

Gly Asp Leu Cys Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Ile
165 170 175

Ser Pro Arg Arg His Glu Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr
180 185 190

Pro Gly His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn
195 200 205

Trp Ser Pro Thr Thr Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro
210 215 220

Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly Val Leu Ala
225 230 235 240

Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val Leu Val
245 250 255

Val Met Leu Leu Phe Ala Gly Val Asp Gly His Thr Arg Val Ser Gly
260 265 270

Gly Ala Ala Ala Ser Asp Thr Arg Gly Leu Val Ser Leu Phe Ser Pro
275 280 285

Gly Ser Ala Gln Lys Ile Gln Leu Val Asn Thr Asn Gly Ser Trp His
290 295 300

Ile Asn Arg Thr Ala Leu Asn Cys Asn Asp Ser Leu Gln Thr Gly Phe
305 310 315 320

Phe Ala Ala Leu Phe Tyr Lys His Lys Phe Asn Ser Ser Gly Cys Pro
325 330 335

Glu Arg Leu Ala Ser Cys Arg Ser Ile Asp Lys Phe Ala Gln Gly Trp
340 345 350

Gly Pro Leu Thr Tyr Thr Glu Pro Asn Ser Ser Asp Gln Arg Pro Tyr
355 360 365

Cys Trp His Tyr Ala Pro Arg Pro Cys Gly Ile Val Pro Ala Ser Gln
370 375 380

Val Cys Gly Pro Val Tyr Cys Phe Thr Pro Ser Pro Val Val Val Gly
385 390 395 400

Thr Thr Asp Arg Phe Gly Val Pro Thr Tyr Asn Trp Gly Ala Asn Asp
405 410 415

Ser Asp Val Leu Ile Leu Asn Asn Thr Arg Pro Pro Arg Gly Asn Trp
420 425 430

Phe Gly Cys Thr Trp Met Asn Gly Thr Gly Phe Thr Lys Thr Cys Gly
435 440 445

Gly Pro Pro Cys Asn Ile Gly Gly Ala Gly Asn Asn Thr Leu Thr Cys
450 455 460

Pro Thr Asp Cys Phe Arg Lys His Pro Glu Ala Thr Tyr Ala Arg Cys
465 470 475 480

Gly Ser Gly Pro Trp Leu Thr Pro Arg Cys Met Val His Tyr Pro Tyr
485 490 495

Arg Leu Trp His Tyr Pro Cys Thr Val Asn Phe Thr Ile Phe Lys Val
500 505 510

Arg Met Tyr Val Gly Gly Val Glu His Arg Phe Glu Ala Ala Cys Asn
515 520 525

Trp Thr Arg Gly Glu Arg Cys Asp Leu Glu Asp Arg Asp Arg Ser Glu
530 535 540

Leu Ser Pro Leu Leu Leu Ser Thr Thr Glu Trp Gln Ile Leu Pro Cys
545 550 555 560

Ser Phe Thr Thr Leu Pro Ala Leu Ser Thr Gly Leu Ile His Leu His
565 570 575

Gln Asn Ile Val Asp Val Gln Tyr Leu Tyr Gly Val Gly Ser Ala Val
580 585 590

Val Ser Leu Val Ile Lys Trp Glu Tyr Val Leu Leu Leu Phe Leu Leu
595 600 605

Leu Ala Asp Ala Arg Ile Cys Ala Cys Leu Trp Met Met Leu Leu Ile
610 615 620

Ala Gln Ala Glu Ala Ala Leu Glu Asn Leu Val Val Leu Asn Ala Ala
625 630 635 640

Ala Val Ala Gly Ala His Gly Thr Leu Ser Phe Leu Val Phe Phe Cys
645 650 655

Ala Ala Trp Tyr Ile Lys Gly Arg Leu Val Pro Gly Ala Ala Tyr Ala
660 665 670

Phe Tyr Gly Val Trp Pro Leu Leu Leu Leu Leu Leu Ala Leu Pro Pro
675 680 685

Arg Ala Tyr Ala
690

(2) INFORMATION FOR SEQ ID NO: 49:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2433 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 



(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..2430

(ix) FEATURE:

(A) NAME/KEY: mat_peptide

(B) LOCATION: 1..2427

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:

ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA CGT AAC ACC AAC 48
Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn
1 5 10 15

CGC CGC CCA CAG GAC GTC AAG TTC CCG GGC GGT GGT CAG ATC GTT GGT 96
Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30

GGA GTT TAC CTG TTG CCG CGC AGG GGC CCC AGG TTG GGT GTG CGC GCG 144
Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala
35 40 45

ACT AGG AAG ACT TCC GAG CGG TCG CAA CCT CGT GGG AGG CGA CAA CCT 192
Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
50 55 60

ATC CCC AAG GCT CGC CGA CCC GAG GGT AGG GCC TGG GCT CAG CCC GGG 240
Ile Pro Lys Ala Arg Arg Pro Glu Gly Arg Ala Trp Ala Gln Pro Gly
65 70 75 80

TAC CCT TGG CCC CTC TAT GGC AAT GAG GGC ATG GGG TGG GCA GGA TGG 288
Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Met Gly Trp Ala Gly Trp
85 90 95

CTC CTG TCA CCC CGC GGC TCT CGG CCT AGT TGG GGC CCT ACA GAC CCC 336
Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro
100 105 110

CGG CGT AGG TCG CGT AAT TTG GGT AAG GTC ATC GAT ACC CTT ACA TGC 384
Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys
115 120 125

GGC TTC GCC GAC CTC GTG GGG TAC ATT CCG CTC GTC GGC GCC CCC CTA 432
Gly Phe Ala Asp Leu Val Gly Tyr Ile Pro Leu Val Gly Ala Pro Leu
130 135 140

GGG GGC GCT GCC AGG GCC CTG GCG CAT GGC GTC CGG GTT CTG GAG GAC 480
Gly Gly Ala Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu Asp
145 150 155 160

GGC GTG AAC TAT GCA ACA GGG AAT TTG CCC GGT TGC TCT TTC TCT ATC 528
Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile
165 170 175

TTC CTC TTG GCT TTG CTG TCC TGT CTG ACC GTT CCA GCT TCC GCT TAT 576
Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser Ala Tyr
180 185 190

GAA GTG CGC AAC GTG TCC GGG ATG TAC CAT GTC ACG AAC GAC TGC TCC 624
Glu Val Arg Asn Val Ser Gly Met Tyr His Val Thr Asn Asp Cys Ser
195 200 205

AAC TCA AGC ATT GTG TAT GAG GCA GCG GAC ATG ATC ATG CAC ACC CCC 672
Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Met Ile Met His Thr Pro
210 215 220

GGG TGC GTG CCC TGC GTT CGG GAG AAC AAC TCT TCC CGC TGC TGG GTA 720
Gly Cys Val Pro Cys Val Arg Glu Asn Asn Ser Ser Arg Cys Trp Val
225 230 235 240

GCG CTC ACC CCC ACG CTC GCA GCT AGG AAC GCC AGC GTC CCC ACC ACG 768
Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Ala Ser Val Pro Thr Thr
245 250 255

ACA ATA CGA CGC CAC GTC GAT TTG CTC GTT GGG GCG GCT GCT TTC TGT 816
Thr Ile Arg Arg His Val Asp Leu Leu Val Gly Ala Ala Ala Phe Cys
260 265 270

TCC GCT ATG TAC GTG GGG GAC CTC TGC GGA TCT GTC TTC CTC GTC TCC 864
Ser Ala Met Tyr Val Gly Asp Leu Cys Gly Ser Val Phe Leu Val Ser
275 280 285

CAG CTG TTC ACC ATC TCG CCT CGC CGG CAT GAG ACG GTG CAG GAC TGC 912
Gln Leu Phe Thr Ile Ser Pro Arg Arg His Glu Thr Val Gln Asp Cys
290 295 300

AAT TGC TCA ATC TAT CCC GGC CAC ATA ACG GGT CAC CGT ATG GCT TGG 960
Asn Cys Ser Ile Tyr Pro Gly His Ile Thr Gly His Arg Met Ala Trp
305 310 315 320

GAT ATG ATG ATG AAC TGG TCG CCT ACA ACG GCC CTG GTG GTA TCG CAG 1008
Asp Met Met Met Asn Trp Ser Pro Thr Thr Ala Leu Val Val Ser Gln
325 330 335

CTG CTC CGG ATC CCA CAA GCT GTC GTG GAC ATG GTG GCG GGG GCC CAT 1056
Leu Leu Arg Ile Pro Gln Ala Val Val Asp Met Val Ala Gly Ala His
340 345 350

TGG GGA GTC CTG GCG GGC CTC GCC TAC TAT TCC ATG GTG GGG AAC TGG 1104
Trp Gly Val Leu Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp
355 360 365

GCT AAG GTT TTG GTT GTG ATG CTA CTC TTT GCC GGC GTC GAC GGG CAT 1152
Ala Lys Val Leu Val Val Met Leu Leu Phe Ala Gly Val Asp Gly His
370 375 380

ACC CGC GTG TCA GGA GGG GCA GCA GCC TCC GAT ACC AGG GGC CTT GTG 1200
Thr Arg Val Ser Gly Gly Ala Ala Ala Ser Asp Thr Arg Gly Leu Val
385 390 395 400

TCC CTC TTT AGC CCC GGG TCG GCT CAG AAA ATC CAG CTC GTA AAC ACC 1248
Ser Leu Phe Ser Pro Gly Ser Ala Gln Lys Ile Gln Leu Val Asn Thr
405 410 415

AAC GGC AGT TGG CAC ATC AAC AGG ACT GCC CTG AAC TGC AAC GAC TCC 1296
Asn Gly Ser Trp His Ile Asn Arg Thr Ala Leu Asn Cys Asn Asp Ser
420 425 430

CTC CAA ACA GGG TTC TTT GCC GCA CTA TTC TAC AAA CAC AAA TTC AAC 1344
Leu Gln Thr Gly Phe Phe Ala Ala Leu Phe Tyr Lys His Lys Phe Asn
435 440 445

TGG TCT GGA TGC CCA GAG CGC TTG GCC AGC TGT CGC TCC ATC GAC AAG 1392
Ser Ser Gly Cys Pro Glu Arg Leu Ala Ser Cys Arg Ser Ile Asp Lys
450 455 460

TTC GCT CAG GGG TGG GGT CCC CTC ACT TAC ACT GAG CCT AAC AGC TCG 1440
Phe Ala Gln Gly Trp Gly Pro Leu Thr Tyr Thr Glu Pro Asn Ser Ser
465 470 475 480

GAC CAG AGG CCC TAC TGC TGG CAC TAC GCG CCT CGA CCG TGT GGT ATT 1488
Asp Gln Arg Pro Tyr Cys Trp His Tyr Ala Pro Arg Pro Cys Gly Ile
485 490 495

GTA CCC GCG TCT CAG GTG TGC GGT CCA GTG TAT TGC TTC ACC CCG AGC 1536
Val Pro Ala Ser Gln Val Cys Gly Pro Val Tyr Cys Phe Thr Pro Ser
500 505 510

CCT GTT GTG GTG GGG ACG ACC GAT CGG TTT GGT GTC CCC ACG TAT AAC 1584
Pro Val Val Val Gly Thr Thr Asp Arg Phe Gly Val Pro Thr Tyr Asn
515 520 525

TGG GGG GCG AAC GAG TCG GAT GTG CTG ATT CTC AAC AAC ACG CGG CCG 1632
Trp Gly Ala Asn Asp Ser Asp Val Leu Ile Leu Asn Asn Thr Arg Pro
530 535 540

CCG CGA GGC AAC TGG TTC GGC TGT ACA TGG ATG AAT GGC ACT GGG TTC 1680
Pro Arg Gly Asn Trp Phe Gly Cys Thr Trp Met Asn Gly Thr Gly Phe
545 550 555 560

ACC AAG ACG TGT GGG GGC CCC CCG TGC AAC ATC GGG GGG GCC GGC AAC 1728
Thr Lys Thr Cys Gly Gly Pro Pro Cys Asn Ile Gly Gly Ala Gly Asn
565 570 575

AAC ACC TTG ACC TGC CCC ACT GAC TGT TTT CGG AAG CAC CCC GAG GCC 1776
Asn Thr Leu Thr Cys Pro Thr Asp Cys Phe Arg Lys His Pro Glu Ala
580 585 590

ACC TAC GCC AGA TGC GGT TCT GGG CCC TGG CTG ACA CCT AGG TGT ATG 1824
Thr Tyr Ala Arg Cys Gly Ser Gly Pro Trp Leu Thr Pro Arg Cys Met
595 600 605

GTT CAT TAC CCA TAT AGG CTC TGG CAC TAC CCC TGC ACT GTC AAC TTC 1872
Val His Tyr Pro Tyr Arg Leu Trp His Tyr Pro Cys Thr Val Asn Phe
610 615 620

ACC ATC TTC AAG GTT AGG ATG TAC GTG GGG GGC GTG GAG CAC AGG TTC 1920
Thr Ile Phe Lys Val Arg Met Tyr Val Gly Gly Val Glu His Arg Phe
625 630 635 640

GAA GCC GCA TGC AAT TGG ACT CGA GGA GAG CGT TGT GAC TTG GAG GAC 1968
Glu Ala Ala Cys Asn Trp Thr Arg Gly Glu Arg Cys Asp Leu Glu Asp
645 650 655

AGG GAT AGA TCA GAG CTT AGC CCG CTG CTG CTG TCT ACA ACA GAG TGG 2016
Arg Asp Arg Ser Glu Leu Ser Pro Leu Leu Leu Ser Thr Thr Glu Trp
660 665 670

CAG ATA CTG CCC TGT TCC TTC ACC ACC CTG CCG GCC CTA TCC ACC GGC 2064
Gln Ile Leu Pro Cys Ser Phe Thr Thr Leu Pro Ala Leu Ser Thr Gly
675 680 685

CTG ATC CAC CTC CAT CAG AAC ATC GTG GAC GTG CAA TAC CTG TAC GGT 2112
Leu Ile His Leu His Gln Asn Ile Val Asp Val Gln Tyr Leu Tyr Gly
690 695 700

GTA GGG TCG GCG GTT GTC TCC CTT GTC ATC AAA TGG GAG TAT GTC CTG 2160
Val Gly Ser Ala Val Val Ser Leu Val Ile Lys Trp Glu Tyr Val Leu
705 710 715 720

TTG CTC TTC CTT CTC CTG GCA GAC GCG CGC ATC TGC GCC TGC TTA TGG 2208
Leu Leu Phe Leu Leu Leu Ala Asp Ala Arg Ile Cys Ala Cys Leu Trp
725 730 735

ATG ATG CTG CTG ATA GCT CAA GCT GAG GCC GCC TTA GAG AAC CTG GTG 2256
Met Met Leu Leu Ile Ala Gln Ala Glu Ala Ala Leu Glu Asn Leu Val
740 745 750

GTC CTC AAT GCG GCG GCC GTG GCC GGG GCG CAT GGC ACT CTT TCC TTC 2304
Val Leu Asn Ala Ala Ala Val Ala Gly Ala His Gly Thr Leu Ser Phe
755 760 765

CTT GTG TTC TTC TGT GCT GCC TGG TAC ATC AAG GGC AGG CTG GTC CCT 2352
Leu Val Phe Phe Cys Ala Ala Trp Tyr Ile Lys Gly Arg Leu Val Pro
770 775 780

GGT GCG GCA TAC GCC TTC TAT GGC GTG TGG CCG CTG CTC CTG CTT CTG 2400
Gly Ala Ala Tyr Ala Phe Tyr Gly Val Trp Pro Leu Leu Leu Leu Leu
785 790 795 800

CTG GCC TTA CCA CCA CGA GCT TAT GCC TAGTAA 2433
Leu Ala Leu Pro Pro Arg Ala Tyr Ala
805 810

(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 809 amino acids 

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn
1 5 10 15

Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30

Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala
35 40 45

Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
50 55 60

Ile Pro Lys Ala Arg Arg Pro Glu Gly Arg Ala Trp Ala Gln Pro Gly
65 70 75 80

Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Met Gly Trp Ala Gly Trp
85 90 95

Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro
100 105 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys
115 120 125

Gly Phe Ala Asp Leu Val Gly Tyr Ile Pro Leu Val Gly Ala Pro Leu
130 135 140

Gly Gly Ala Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu Asp
145 150 155 160

Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile
165 170 175

Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser Ala Tyr
180 185 190

Glu Val Arg Asn Val Ser Gly Met Tyr His Val Thr Asn Asp Cys Ser
195 200 205

Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Met Ile Met His Thr Pro
210 215 220

Gly Cys Val Pro Cys Val Arg Glu Asn Asn Ser Ser Arg Cys Trp Val
225 230 235 240

Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Ala Ser Val Pro Thr Thr
245 250 255

Thr Ile Arg Arg His Val Asp Leu Leu Val Gly Ala Ala Ala Phe Cys
260 265 270

Ser Ala Met Tyr Val Gly Asp Leu Cys Gly Ser Val Phe Leu Val Ser
275 280 285

Gln Leu Phe Thr Ile Ser Pro Arg Arg His Glu Thr Val Gln Asp Cys
290 295 300

Asn Cys Ser Ile Tyr Pro Gly His Ile Thr Gly His Arg Met Ala Trp
305 310 315 320

Asp Met Met Met Asn Trp Ser Pro Thr Thr Ala Leu Val Val Ser Gln
325 330 335

Leu Leu Arg Ile Pro Gln Ala Val Val Asp Met Val Ala Gly Ala His
340 345 350

Trp Gly Val Leu Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp
355 360 365

Ala Lys Val Leu Val Val Met Leu Leu Phe Ala Gly Val Asp Gly His
370 375 380

Thr Arg Val Ser Gly Gly Ala Ala Ala Ser Asp Thr Arg Gly Leu Val
385 390 395 400

Ser Leu Phe Ser Pro Gly Ser Ala Gln Lys Ile Gln Leu Val Asn Thr
405 410 415

Asn Gly Ser Trp His Ile Asn Arg Thr Ala Leu Asn Cys Asn Asp Ser
420 425 430

Leu Gln Thr Gly Phe Phe Ala Ala Leu Phe Tyr Lys His Lys Phe Asn
435 440 445

Ser Ser Gly Cys Pro Glu Arg Leu Ala Ser Cys Arg Ser Ile Asp Lys
450 455 460

Phe Ala Gln Gly Trp Gly Pro Leu Thr Tyr Thr Glu Pro Asn Ser Ser
465 470 475 480

Asp Gln Arg Pro Tyr Cys Trp His Tyr Ala Pro Arg Pro Cys Gly Ile
485 490 495

Val Pro Ala Ser Gln Val Cys Gly Pro Val Tyr Cys Phe Thr Pro Ser
500 505 510

Pro Val Val Val Gly Thr Thr Asp Arg Phe Gly Val Pro Thr Tyr Asn
515 520 525

Trp Gly Ala Asn Asp Ser Asp Val Leu Ile Leu Asn Asn Thr Arg Pro
530 535 540

Pro Arg Gly Asn Trp Phe Gly Cys Thr Trp Met Asn Gly Thr Gly Phe
545 550 555 560

Thr Lys Thr Cys Gly Gly Pro Pro Cys Asn Ile Gly Gly Ala Gly Asn
565 570 575

Asn Thr Leu Thr Cys Pro Thr Asp Cys Phe Arg Lys His Pro Glu Ala
580 585 590

Thr Tyr Ala Arg Cys Gly Ser Gly Pro Trp Leu Thr Pro Arg Cys Met
595 600 605

Val His Tyr Pro Tyr Arg Leu Trp His Tyr Pro Cys Thr Val Asn Phe
610 615 620

Thr Ile Phe Lys Val Arg Met Tyr Val Gly Gly Val Glu His Arg Phe
625 630 635 640

Glu Ala Ala Cys Asn Trp Thr Arg Gly Glu Arg Cys Asp Leu Glu Asp
645 650 655

Arg Asp Arg Ser Glu Leu Ser Pro Leu Leu Leu Ser Thr Thr Glu Trp
660 665 670

Gln Ile Leu Pro Cys Ser Phe Thr Thr Leu Pro Ala Leu Ser Thr Gly
675 680 685

Leu Ile His Leu His Gln Asn Ile Val Asp Val Gln Tyr Leu Tyr Gly
690 695 700

Val Gly Ser Ala Val Val Ser Leu Val Ile Lys Trp Glu Tyr Val Leu
705 710 715 720

Leu Leu Phe Leu Leu Leu Ala Asp Ala Arg Ile Cys Ala Cys Leu Trp
725 730 735

Met Met Leu Leu Ile Ala Gln Ala Glu Ala Ala Leu Glu Asn Leu Val
740 745 750

Val Leu Asn Ala Ala Ala Val Ala Gly Ala His Gly Thr Leu Ser Phe
755 760 765

Leu Val Phe Phe Cys Ala Ala Trp Tyr Ile Lys Gly Arg Leu Val Pro
770 775 780

Gly Ala Ala Tyr Ala Phe Tyr Gly Val Trp Pro Leu Leu Leu Leu Leu
785 790 795 800

Leu Ala Leu Pro Pro Arg Ala Tyr Ala
805



(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(ix) FEATURE:

(A) NAME/KEY: Modified-site

(B) LOCATION: 1..17

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:




Val 



(2) INFORMATION FOR SEQ ID NO: 52:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(ix) FEATURE:

(A) NAME/KEY: Modified-site

(B) LOCATION: 1..22

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:

Gly Gly Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
1 5 10 15

Ser Pro Thr Thr Ala Leu
20

(2) INFORMATION FOR SEQ ID NO: 53:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 37 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(ix) FEATURE:

(A) NAME/KEY: Modified-site

(B) LOCATION: 1..37

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53:

Tyr Glu Val Arg Asn Val Ser Gly Ile Tyr His Val Thr Asn Asp Cys
1 5 10 15

Ser Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Met Ile Met His Thr
20 25 30

Pro Gly Cys Gly Lys
35

(2) INFORMATION FOR SEQ ID NO: 54:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(ix) FEATURE:

(A) NAME/KEY: Modified-site

(B) LOCATION: 1..25

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:

Gly Gly Thr Pro Thr Val Ala Thr Arg Asp Gly Lys Leu Pro Ala Thr
1 5 10 15

Gln Leu Arg Arg His Ile Asp Leu Leu
20 25

(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(ix) FEATURE:

(A) NAME/KEY: Modified-site

(B) LOCATION: 1..25

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:

Gly Gly Thr Pro Thr Leu Ala Ala Arg Asp Ala Ser Val Pro Thr Thr
1 5 10 15

Thr Ile Arg Arg His Val Asp Leu Leu
20 25 

(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56:

Leu Leu Ser Cys Leu Thr Val Pro Ala Ser Ala Tyr Gln Val Arg Asn
1 5 10 15

Ser Thr Gly Leu
20

(2) INFORMATION FOR SEQ ID NO: 57:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:

Gln Val Arg Asn Ser Thr Gly Leu Tyr His Val Thr Asn Asp Cys Pro
1 5 10 15

Asn Ser Ser Ile
20

(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:

Asn Asp Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala His Asn Ala Ile
1 5 10 15

Leu His Thr Pro 
20 

(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:

Ser Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Met Ile Met His Thr
1 5 10 15 

Pro Gly Cys Val 
20 

(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60:

His Asp Ala Ile Leu His Thr Pro Gly Val Pro Cys Val Arg Glu Gly
1 5 10 15 

Asn Val Ser 

(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:

Cys Val Arg Glu Gly Asn Val Ser Arg Cys Trp Val Ala Met Thr Pro
1 5 10 15

Thr Val Ala Thr 
20 

(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:

Ala Met Thr Pro Thr Val Ala Thr Arg Asp Gly Lys Leu Pro Ala Thr
1 5 10 15

Gln Leu Arg Arg
20 

(2) INFORMATION FOR SEQ ID NO: 63: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:

Leu Pro Ala Thr Gln Leu Arg Arg His Ile Asp Leu Leu Val Gly Ser
1 5 10 15

Ala Thr Leu Cys 
20 

(2) INFORMATION FOR SEQ ID NO: 64:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:

Leu Val Gly Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu
1 5 10 15

Cys Gly Ser Val 
20 

(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:

Gln Leu Phe Thr Phe Ser Pro Arg Arg His Trp Thr Thr Gln Gly Cys
1 5 10 15

Asn Cys Ser Ile
20



(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66:

Thr Gln Gly Cys Asn Cys Ser Ile Tyr Pro Gly His Ile Thr Gly His
1 5 10 15

Arg Met Ala Trp 
20 

(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:

Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp Ser Pro
1 5 10 15

Thr Ala Ala Leu 
20 

(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68:

Asn Trp Ser Pro Thr Ala Ala Leu Val Met Ala Gln Leu Leu Arg Ile
1 5 10 15

Pro Gln Ala Ile
20 

(2) INFORMATION FOR SEQ ID NO: 69: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:

Leu Leu Arg Ile Pro Gln Ala Ile Leu Asp Met Ile Ala Gly Ala His
1 5 10 15

Trp Gly Val Leu 
20 

(2) INFORMATION FOR SEQ ID NO: 70:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70: 

Ala Gly Ala His Trp Gly Val Leu Ala Gly Ile Ala Tyr Phe Ser Met
1 5 10 15

Val Gly Asn Met
20

(2) INFORMATION FOR SEQ ID NO: 71:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71:

Val Val Leu Leu Leu Phe Ala Gly Val Asp Ala Glu Thr Ile Val Ser
1 5 10 15

Gly Gly Gln Ala
20 

(2) INFORMATION FOR SEQ ID NO: 72: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72:

Ser Gly Leu Val Ser Leu Phe Thr Pro Gly Ala Lys Gln Asn Ile Gln
1 5 10 15

Leu Ile Asn Thr
20

(2) INFORMATION FOR SEQ ID NO: 73: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73:

Gln Asn Ile Gln Leu Ile Asn Thr Asn Gly Gln Trp His Ile Asn Ser
1 5 10 15

Thr Ala Leu Asn 
20 

(2) INFORMATION FOR SEQ ID NO: 74: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74:

Leu Asn Cys Asn Glu Ser Leu Asn Thr Gly Trp Trp Leu Ala Gly Leu
1 5 10 15

Ile Tyr Gln His Lys
20 

(2) INFORMATION FOR SEQ ID NO: 75: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75:

Ala Gly Leu Ile Tyr Gln His Lys Phe Asn Ser Ser Gly Cys Pro Glu
1 5 10 15

Arg Leu Ala Ser 
20 

(2) INFORMATION FOR SEQ ID NO: 76: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76:

Gly Cys Pro Glu Arg Leu Ala Ser Cys Arg Pro Leu Thr Asp Phe Asp
1 5 10 15

Gln Gly Trp Gly
20 

(2) INFORMATION FOR SEQ ID NO: 77: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77:

Thr Asp Phe Asp Gln Gly Trp Gly Pro Ile Ser Tyr Ala Asn Gly Ser
1 5 10 15

Gly Pro Asp Gln
20

(2) INFORMATION FOR SEQ ID NO: 78:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78:

Ala Asn Gly Ser Gly Pro Asp Gln Arg Pro Tyr Cys Trp His Tyr Pro
1 5 10 15

Pro Lys Pro Cys 
20 

(2) INFORMATION FOR SEQ ID NO: 79:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79:

Trp His Tyr Pro Pro Lys Pro Cys Gly Ile Val Pro Ala Lys Ser Val
1 5 10 15

Cys Gly Pro Val 
20 

(2) INFORMATION FOR SEQ ID NO: 80:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80:

Ala Lys Ser Val Cys Gly Pro Val Tyr Cys Phe Thr Pro Ser Pro Val
1 5 10 15

Val Val Gly Thr 
20 

(2) INFORMATION FOR SEQ ID NO: 81: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81:

Pro Ser Pro Val Val Val Gly Thr Thr Asp Arg Ser Gly Ala Pro Thr
1 5 10 15

Tyr Ser Trp Gly
20 

(2) INFORMATION FOR SEQ ID NO: 82:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82:

Gly Ala Pro Thr Tyr Ser Trp Gly Glu Asn Asp Thr Asp Val Phe Val
1 5 10 15 

Leu Asn Asn Thr 
20 

(2) INFORMATION FOR SEQ ID NO: 83: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83:

Gly Asn Trp Phe Gly Cys Thr Trp Met Asn Ser Thr Gly Phe Thr Lys
1 5 10 15

Val Cys Gly Ala
20 

(2) INFORMATION FOR SEQ ID NO: 84:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84:

Gly Phe Thr Lys Val Cys Gly Ala Pro Pro Val Cys Ile Gly Gly Ala
1 5 10 15

Gly Asn Asn Thr
20

(2) INFORMATION FOR SEQ ID NO: 85:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85:

Ile Gly Gly Ala Gly Asn Asn Thr Leu His Cys Pro Thr Asp Cys
1 5 10 15

Lys His Pro 



(2) INFORMATION FOR SEQ ID NO: 86: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 20 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86:

Thr Asp Cys Phe Arg Lys His Pro Asp Ala Thr Tyr Ser Arg Cys Gly
1 5 10 15

Ser Gly Pro Trp
20 

(2) INFORMATION FOR SEQ ID NO: 87: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87:

Ser Arg Cys Gly Ser Gly Pro Trp Ile Thr Pro Arg Cys Leu Val Asp
1 5 10 15

Tyr Pro Tyr Arg 
20 

(2) INFORMATION FOR SEQ ID NO: 88: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88:

Cys Leu Val Asp Tyr Pro Tyr Arg Leu Trp His Tyr Pro Cys Thr Ile
1 5 10 15

Asn Tyr Thr Ile
20 

(2) INFORMATION FOR SEQ ID NO: 89: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89:

Pro Cys Thr Ile Asn Tyr Thr Ile Phe Lys Ile Arg Met Tyr Val Gly
1 5 10 15 

Gly Val Glu His 
20 

(2) INFORMATION FOR SEQ ID NO: 90: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90:

Met Tyr Val Gly Gly Val Glu His Arg Leu Glu Ala Ala Cys Asn Trp
1 5 10 15 

Thr Pro Gly Glu 
20 

(2) INFORMATION FOR SEQ ID NO: 91: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91: 

Ala Cys Asn Trp Thr Pro Gly Glu Arg Cys Asp Leu Glu Asp Arg Asp
1 5 10 15

Arg Ser Glu Leu
20

(2) INFORMATION FOR SEQ ID NO: 92:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92:

Glu Asp Arg Asp Arg Ser Glu Leu Ser Pro Leu Leu Leu Thr
1 5 10

Gln Trp Gln Val
20 

(2) INFORMATION FOR SEQ ID NO: 93: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93:

Tyr Gln Val Arg Asn Ser Thr Gly Leu
1 5 

(2) INFORMATION FOR SEQ ID NO: 94: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: YES 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94:
ACGTCCGTAC GTTCGAATTA ATTAATCGA 29

(2) INFORMATION FOR SEQ ID NO: 95: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 60 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: YES



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95:
CCTCCGGACG TGCACTAGCT CCCGTCTGTG GTAGTGGTGG TAGTGATTAT CAATTAATTC 60



(2) INFORMATION FOR SEQ ID NO: 96: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96:
GTTTAACCAC TGCATGATG

(2) INFORMATION FOR SEQ ID NO: 97: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97: 
GTCCCATCGA GTGCGGCTAC 

(2) INFORMATION FOR SEQ ID NO: 98: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 45 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98: 



CGTGACATGG TACATTCCGG ACACTTGGCG CACTTCATAA GCGGA

(2) INFORMATION FOR SEQ ID NO: 99:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 42 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99: 
TGCCTCATAC ACAATGGAGC TCTGGGACGA GTCGTTCGTG AC
(2) INFORMATION FOR SEQ ID NO: 100: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100: 
TACCCAGCAG CGGGAGCTCT GTTGCTCCCG AACGCAGGGC AC 
(2) INFORMATION FOR SEQ ID NO: 101: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101: 
TGTCGTGGTG GGGACGGAGG CCTGCCTAGC TGCGAGCGTG GG 
(2) INFORMATION FOR SEQ ID NO: 102: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 48 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102:
CGTTATGTGG CCCGGGTAGA TTGAGCACTG GCAGTCCTGC ACCGTCTC
(2) INFORMATION FOR SEQ ID NO: 103: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103:
CAGGGCCGTT CTAGGCCTCC ACTGCATCAT CATATCCCAA GC 
(2) INFORMATION FOR SEQ ID NO: 104: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104: 
CCGGAATGTA CCATGTCACG AACGAC 
(2) INFORMATION FOR SEQ ID NO: 105: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105: 
GCTCCATTGT GTATGAGGCA GCGG
(2) INFORMATION FOR SEQ ID NO: 106: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106: 
GAGCTCCCGC TGCTGGGTAG CGC 23 
(2) INFORMATION FOR SEQ ID NO: 107: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107: 
CCTCCGTCCC CACCACGACA ATACG 25
(2) INFORMATION FOR SEQ ID NO: 108: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108: 
CTACCCGGGC CACATAACGG GTCACCG 27

(2) INFORMATION FOR SEQ ID NO: 109: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109:
GGAGGCCTAC AACGGCCCTG GTGG 
(2) INFORMATION FOR SEQ ID NO: 110: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110:
TTCTATCGAT TAAATAGAAT TC 
(2) INFORMATION FOR SEQ ID NO: 111: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111:
GCCATACGCT CACAGCCGAT CCC 

THE CLAIMS DEFINING THE INVENTION ARE AS FOLLOWS:-

1. An isolated HCV single or specific oligomeric envelope protein selected from the
group consisting of E1 and/or E2 and/or E1/E2, having a purity degree of at least 80%.

2. An isolated HCV single or specific oligomeric envelope protein selected from the 
group consisting of E1 and/or E2 and/or E1/E2, having a purity degree of at least 90%.

3. An isolated HCV single or specific oligomeric envelope protein selected from the
group consisting of E1 and/or E2 and/or E1/E2, having a purity degree of at least 95%.

4. An isolated HCV envelope protein according to any one of claims 1 to 3, wherein 
said isolated HCV envelope proteins are expressed from recombinant mammalian cells 
such as by using a vaccinia virus based system. 

5. An isolated HCV envelope protein according to claim 4, wherein said isolated 
HCV envelope proteins are expressed from recombinant yeast cells.

6. An isolated HCV envelope protein according to any one of claims 1 to 5, for use 
as a medicament. 

7. An isolated HCV envelope protein according to any one of claims 1 to 5, for use
as a vaccine for immunizing a mammal against HCV, comprising administering an
effective amount of said composition, optionally accompanied by pharmaceutically
acceptable adjuvants, to produce an immune response.

8. An isolated HCV envelope protein according to claim 7 wherein said mammal is 
human. 

9. A method for immunising a mammal against HCV, comprising the steps of 
administering to said mammal an effective amount of an isolated HCV envelope protein 
according to any one of claims 1 to 5, to produce an immune response. 

10. A method according to claim 9, wherein said mammal is human.

11. A vaccine composition for immunizing a mammal against HCV, comprising an 
effective amount of an isolated HCV envelope protein according to any one of claims 1 
to 5, optionally accompanied by pharmaceutically acceptable adjuvants.

12. A vaccine composition according to claim 11 wherein said mammal is human.

13. An isolated HCV envelope protein according to any one of claims 1 to 5, for in 
vitro detection of HCV antibodies present in a biological sample. 

14. A method for in vitro diagnosis of HCV antibodies present in a biological 
sample, comprising at least the following steps: 

(i) contacting said biological sample with an isolated HCV envelope protein 
according to any one of claims 1 to 5, under appropriate conditions, which 
allow the formation of an immune complex, 

(ii) removing unbound components, 

(iii) incubating the immune complexes formed with heterologous antibodies,
with said heterologous antibodies being conjugated to a detectable label
under appropriate conditions, 

(iv) detecting the presence of said immune complexes visually or 
mechanically. 

15. A method according to claim 14 wherein said isolated HCV envelope protein is 
in an immobilised form. 

16. A kit for determining the presence of HCV antibodies present in a biological
sample, comprising: 

at least one isolated HCV envelope protein according to any one of claims 1 to 5,

a buffer or components necessary for producing the buffer enabling binding
reaction between these proteins and antibodies against HCV present in said
biological sample,

a means for detecting the immune complexes formed in the preceding binding
reaction.

17. A kit according to claim 16 wherein said at least one isolated HCV envelope 
protein is in an immobilised form on a solid substrate. 

18. Use of an isolated HCV envelope protein according to any one of claims 1 to 5, 
comprising HCV E1 protein, for in vitro monitoring HCV disease or prognosing the
response to treatment of patients suffering from HCV infection comprising:

incubating a biological sample from a patient with HCV infection with an E1
protein or a suitable part thereof under conditions allowing the formation of an
immunological complex,

removing unbound components,

calculating the anti-E1 titers present in said sample at the start of and during the
course of treatment,

monitoring the natural course of HCV disease, or prognosing the response to
treatment of said patient on the basis of the amount of anti-E1 titers found in said sample
at the start of treatment and/or during the course of treatment.

19. Use according to claim 18 wherein the HCV E1 protein is a HCV single E1
protein.

20. Use according to any one of claims 18 or 19 wherein said treatment is treatment
with interferon. 

21. A kit for monitoring HCV disease or prognosing the response to treatment of 
patients suffering from HCV infection comprising: 

at least one isolated HCV envelope protein according to any one of claims 1 to 5, 
a buffer or components necessary for producing the buffer enabling the binding 
reaction between these proteins and the anti-E1 antibodies present in a biological
sample, 

means for detecting the immune complexes formed in the preceding binding 
reaction, and 

optionally also an automated scanning and interpretation device for inferring a 
decrease of anti-E1 titers during the progression of treatment.

22. A kit according to claim 21 wherein said at least one isolated HCV envelope 
protein is an E1 protein.

23. A kit according to claim 21 or claim 22, wherein said treatment is treatment with 
interferon. 

24. A serotyping assay for detecting one or more serological types of HCV present in
a biological sample, comprising at least the following steps : 

(i) contacting the biological sample to be analyzed for the presence of HCV 
antibodies of one or more serological types, with at least one isolated 
HCV E1 and/or E2 and/or E1/E2 protein according to any one of claims 1
to 5, under appropriate conditions which allow the formation of an 
immune complex, 

(ii) removing unbound components,

(iii) incubating the immune complexes formed with heterologous antibodies,
with said heterologous antibodies being conjugated to a detectable label
under appropriate conditions,

(iv) detecting the presence of said immune complexes visually or
mechanically (e.g. by means of densitometry, fluorimetry, colorimetry)
and inferring the presence of one or more HCV serological types present
from the observed binding pattern.

25. A serotyping assay according to claim 24 for detecting antibodies of the different 
types of HCV combined in one assay format. 

26. A serotyping assay according to claim 24 or claim 25 wherein said at least one 
isolated HCV E1 and/or E2 and/or E1/E2 protein is in an immobilised form.

27. A kit for serotyping one or more serological types of HCV present in a biological
sample, comprising: 

at least one isolated HCV E1 and/or E2 and/or E1/E2 protein according to any
one of claims 1 to 5,

a buffer or components necessary for producing the buffer enabling the binding 
reaction between these proteins and the anti-E1 antibodies present in a biological
sample, 

means for detecting the immune complexes formed in the preceding binding 
reaction, and 

optionally also an automated scanning and interpretation device for detecting the
presence of one or more serological types present from the observed binding 
pattern. 

28. A kit according to claim 27 for detecting antibodies to said serological types of
HCV.

29. An isolated HCV envelope protein according to any one of claims 1 to 5, to raise
upon immunization an E1 and/or E2 specific monoclonal antibody.

30. An isolated HCV envelope protein according to any one of claims 1 to 5, for the 
preparation of an immunoassay kit. 

31. Use of an isolated HCV envelope protein according to any one of claims 1 to 5,
for detecting HCV antibodies present in a biological sample. 

32. Use of an isolated HCV envelope protein according to any one of claims 1 to 5, 
for the manufacture of a medicament for immunising a mammal against HCV. 

33. Use according to claim 32 wherein said mammal is human. 

34. An isolated HCV single or specific oligomeric envelope protein, substantially as
herein described with reference to one or more of the examples but excluding 
comparative examples. 

35. A method for immunising a mammal against HCV, substantially as herein 
described with reference to one or more of the examples but excluding comparative
examples.

36. A vaccine composition for immunising a mammal against HCV, substantially as 
herein described with reference to one or more of the examples but excluding 
comparative examples. 

37. A method for in vitro diagnosis of HCV antibodies present in a biological sample,
substantially as herein described with reference to one or more of the examples but 
excluding comparative examples. 

38. A kit for determining the presence of HCV antibodies present in a biological 
sample, substantially as herein described with reference to one or more of the examples 
but excluding comparative examples. 

39. Use of an isolated HCV single or specific oligomeric envelope protein, 
substantially as herein described with reference to one or more of the examples but 
excluding comparative examples. 

40. A kit for monitoring HCV disease or prognosing the response to treatment of 
patients suffering from HCV infection, substantially as herein described with reference 
to one or more of the examples but excluding comparative examples. 

41. A serotyping assay for detecting one or more serological types of HCV present in a
biological sample, substantially as herein described with reference to one or more of the 
examples but excluding comparative examples. 

42. A kit for serotyping one or more serological types of HCV present in a biological 

sample, substantially as herein described with reference to one or more of the examples 

but excluding comparative examples. 

DATED this 29th Day of October, 1999 
INNOGENETICS N.V. 

Attorney: MW A. RAJKOVIC

Figure 21



5' GGCATGCAAGCTTAATTAATT 3' (SEQ ID NO 1)
3' ACGTCCGTACGTTCGAATTAATTAATCGA 5' (SEQ ID NO 94)
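
The pairing of the two oligonucleotides above can be checked with a short script. The following is only an illustrative sketch: it assumes the two strands read exactly as transcribed in this figure (SEQ ID NO 1 written 5' to 3', SEQ ID NO 94 written 3' to 5'), and the function and variable names are hypothetical helpers, not part of the specification.

```python
# Watson-Crick complement table for unambiguous DNA bases.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def complement(seq: str) -> str:
    """Return the base-by-base complement of seq (orientation is not reversed)."""
    return "".join(COMPLEMENT[base] for base in seq)


def anneal_offset(top_5to3: str, bottom_3to5: str) -> int:
    """Return the index in bottom_3to5 where top_5to3 pairs perfectly, or -1."""
    return bottom_3to5.find(complement(top_5to3))


# Sequences as transcribed from the figure (assumed free of OCR errors).
SEQ_ID_1 = "GGCATGCAAGCTTAATTAATT"            # written 5' -> 3'
SEQ_ID_94 = "ACGTCCGTACGTTCGAATTAATTAATCGA"  # written 3' -> 5'

offset = anneal_offset(SEQ_ID_1, SEQ_ID_94)
# Unpaired bases of the lower strand on either side of the duplex.
left_overhang = SEQ_ID_94[:offset]
right_overhang = SEQ_ID_94[offset + len(SEQ_ID_1):]

print(offset, left_overhang, right_overhang)
```

Under these assumptions the upper strand pairs with the lower strand starting at its fifth base, leaving four unpaired bases (ACGT and TCGA, read 3' to 5') at each end of the lower strand.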



5' CCGGGGAGGCCTGCACGTGATCGAGGGCAGACACCATCACCACCATCACTAATAGT
TAATTAACTGCA 3' (SEQ ID NO 2)

3' CCTCCGGACGTGCACTAGCTCCCGTCTGTGGTAGTGGTGGTAGTGATTATCAATTAATTG
5' (SEQ ID NO 95)



SEQ ID NO 3 (HCCI9A)

ATGCCCGGTTGCTCTTTCTCTATCTTCCTCTTGGCTTTACTGTCCTGTCTGACCATTCCA
GCTTCCGCTTATGAGGTGCGCAACGTGTCCGGGATGTACCATGTCACGAACGACTGCT
CCAACTCAAGCATTGTGTATGAGGCAGCGGACATGATCATGCACACCCCCGGGTGCGT

GCCCTGCGTTCGGGAGAACAACTCTTCCCGCTGCTGGGTAGCGCTCACCCCCACGCTC 
GCAGCTAGGAACGCCAGCGTCCCCACCACGACAATACGACGCCACGTCGATTTGCTCG 
TTGGGGCGGCTGCTCTCTGTTCCGCTATGTACGTGGGGGATCTCTGCGGATCTGTCTTC 
CTCGTCTCCCAGCTGTTCACCATCTCGGCTCGCCGGCATGAGACGGTGCAGGACTGCA 
ATTGCTCAATCTATCCCGGCCACATAACAGGTCACCGTATGGCTTGGGATATGATGAT 
AACTGGTCGCCTACAACGGCCCTGGTGGTATCGCAGCTGCTCCGGATCCCACAAGCT 
GTCGTGGACATGGTGGCGGGGGCCCATTGGGGAGTCCTGGCGGGCCTCGCCTACTATT 
CCATGGTGGGGAACTGGGCTAAGGTTTTGATTGTGATGCTACTCTTTGCTCTCTAATAG 

SEQ ID NO 5 (HCCI10A) 

ATGTTGGGTAAGGTCATCGATACCCTTACATGCGGCTTCGCCGACCTCGTGGGGTACA 
TTCCGCTCGTCGGCGCCCCCCTAGGGGGCGCTGCCAGGGCGCTGGCGCATGGCGTCCG 
GGTTCTGGAGGACGGCGTGAACTATGCAACAGGGAATTTGCCCGGTTGCTCTTTCTCT 
ATCTTCCTCTTGGCTTTGCTGTCCTGTCTGACCGTTCCAGCTTCCGCTTATGAAGTGCG
CAACGTGTCCGGGATGTACCATGTCACGAACGACTGCTCCAACTCAAGCATTGTGTAT 
GAGGCAGCGGACATGATCATGCACACCCCCGGGTGCGTGCCCTGCGTTCGGGAGAAC 
AACTCTTCCCGCTGCTGGGTAGCGCTCACCCCCACGCTCGCAGCTAGGAACGCCAGCG
TCCCCACCACGACAATACGACGCCACGTCGATTTGCTCGTTGGGGCGGCTGCTTTCTG 




TTCCGCTATGTACGTGGGGGACCTCTGCGGATCTGTCTTCCTCGTCTCCCAGCTGTTCA 
CCATCTCGCCTCGCCGGCATGAGACGGTGCAGGACTGCAATTGCTCAATCTATCCCGG
CCACATAACGGGTCACCGTATGGCTTGGGATATGATGATGAACTGGTCGCCTACAACG 
GCCCTGGTGGTATCGCAGCTGCTCCGGATCCCACAAGCTGTCGTGGACATGGTGGCGG

GGGCCCATTGGGGAGTCCTGGCGGGTCTCGCGTACTATTCCATGGTGGGGAACTGGGC 
TAAGGTTTTGATTGTGATGCTACTCTTTGCTCCCTAATAG

SEQ ID NO 7 (HCCI11A)

ATGTTGGGTAAGGTCATCGATACCCTTACGTGCGGCTTCGCCGACCTCATGGGGTACA
TTCCGCTCGTCGGCGCCCCCCTAGGGGGTGCTGCCAGAGCCCTGGCGCATGGCGTCCG 
GGTTCTGGAAGACGGCGTGAACTATGCAACAGGGAATTTGCCTGGTTGCTCTTTCTCTA 
TCTTCCTCTTGGCTTTACTGTCCTGTCTGACCATTCCAGCTTCCGCTTATGAGGTGCGC
AACGTGTCCGGGATGTACCATGTCACGAACGACTGCTCCAACTCAAGCATTGTGTATG 
AGGCAGCGGACATGATCATGCACACCCCCGGGTGCGTGCCCTGCGTTCGGGAGAACA 
ACTCTTCCCGCTGCTGGGTAGCGCTCACCCCCACGCTCGCAGCTAGGAACGCCAGCGT
CCCCACTACGACAATACGACGGCACGTCGATTTGCTCGTTGGGGCGGCTGCTTTCTGTT
CCGCTATGTACGTGGGGGATCTCTGCGGATCTGTCTTCCTCGTCTCCCAGCTGTTCACC 
ATCTCGCCTCGCCGGCATGAGACGGTGCAGGACTGCAATTGCTCAATCTATCCCGGCC 
ACATAACAGGTCACCGTATGGCTTGGGATATGATGATGAACTGGTAATAG 

SEQ ID NO 9 (HCCI12A) 

ATGCCCGGTTGCTCTTTCTCTATCTTCCTCTTGGCCCTGCTGTCCTGTCTGACCATACCA
GCTTCCGCTTATGAAGTGCGCAACGTGTCCGGGGTGTACCATGTCACGAACGAGTGCT 
CCAACTCAAGCATAGTGTATGAGGCAGCGGACATGATCATGCACACCCCCGGGTGCGT 
GCCCTGCGTTCGGGAGGGCAACTCCTCGCGTTGCTGGGTGGCGCTCACTCCCACGCTC
GCGGCCAGGAACGCCAGCGTCCCCACAACGACAATACGACGCCACGTCGATTTGCTC 
GTTGGGGCTGCTGCTTTCTGTTCCGCTATGTACGTGGGGGATCTCTGCGGATCTG- 



CCTTGTTTCCCAGCTGTTCACCTTCTCACCTCGCCC 



TT 

-GGGATGAAACAGTACAGGACTGCA 

ACTGCTCAATCTATCCCGGCCATGTATCAGGTCACCGCATGGCTTGGGATATGATGAT
GAACTGGTCCTAATAG 



SEQ ID NO 11 (HCCI13A)

ATGTCCGGTTGCTCTTTCTCTATCTTCCTCTTGGCCCTGCTGTCCTGTCTGACCATACCA
GCTTCCGCTTATGAAGTGCGCAACGTGTCCGGGGTGTACCATGTCACGAACGACTGCT
CCAACTCAAGCATAGTGTATGAGGCAGCGGACATGATCATGCACACCCCCGGGTGCGT
GCCCTGCGTTCGGGAGGGCAACTCCTCCCGTTGCTGGGTGGCGCTCACTCCCACGCTC
GCGGCCAGGAACGCCAGCGTCCCCACAACGACAATACGACGCCACGTCGATTTGCTC
GTTGGGGCTGCTGCTTTCTGTTCCGCTATGTACGTGGGGGATCTCTGCGGATCTGTTTT 
CCTTGTTTCCCAGCTGTTCACCTTCTCACCTCGCCGGCATCAAACAGTACAGGACTGCA 

ACTGCTCAATCTATCCCGGCCATGTATCAGGTCACCGCATGGCTTGGGATATGATGAT
GAACTGGTAATAG 

SEQ ID NO 13 (HCCI17A)

ATGCTGGGTAAGGCCATCGATACCCTTACGTGCGGCTTCGCCGACCTCGTGGGGTACA

TTCCGCTCGTCGGCGCCCCCCTAGGGGGCGCTGCCAGGGCCCTGGCGCATGGCGTCCG 

GGTTCTGGAAGACGGCGTGAACTATGCAACAGGGAATTTGCCTGGTTGCTCTTTCTCTA 

TCTTCCTCTTGGCTTTACTGTCCTGTCTAACCATTCCAGCTTCCGCTTACGAGGTGCGC 

AACGTGTCCGGGATGTACCATGTCACGAACGACTGCTCCAACTCAAGGATTGTGTATG 

AGGCAGCGGACATGATCATGCACACCCCCGGGTGCGTGCCCTGCGTTCGGGAGAACA

ACTCTTCCCGCTGCTGGGTAGCGCTCACCCCCACGCTCGCGGCTAGGAACGCCAGCAT 

CCCCACTACAACAATACGACGCCACGTCGATTTGCTCGTTGGGGCGGCTGCTTTCTGTT 

CCGCTATGTACGTGGGGGATCTCTGCGGATCTGTCTTCCTCGTCTCCCAGCTGTTCACC 

ATCTCGCCTCGCCGGCATGAGACGGTGCAGGACTGCAATTGCTCAATCTATCCCGGCC 

ACATAACGGGTCACCGTATGGCTTGGGATATGATGATGAACTGGTACTAATAG 

SEQ ID NO 15 (HCPr51)
ATGCCCGGTTGCTCTTTCTCTATCTT 

SEQ ID NO 16 (HCPr52)
ATGTTGGGTAAGGTCATCGATACCCT 

SEQ ID NO 17 (HCPr53) 

CTATTAGGACCAGTTCATCATCATATCCCA 

SEQ ID NO 18 (HCPr54)
CTATTACCAGTTCATCATCATATCCCA 

SEQ ID NO 19 (HCPr107) 

ATACGACGCCACGTCGATTCCCAGCTGTTCACCATC 

SEQ ID NO 20 (HCPM03)

GATGGTGAACAGCTGGGAATCGACGTGGCGTCGTAT 

SEQ ID NO 21 (HCCI37)

ATGTTGGGTAAGGTCATCGATACCCTTACATGCGGCTTCGCCGACCTCGTGGGGTACA 

TTCCGCTCGTCGGCGCCCCCCTAGGGGGCGCTGCCAGGGCCCTGGCGCATGGCGTCCG 

GGTTCTGGAGGACGGCGTGAACTATGCAACAGGGAATTTGCCCGGTTGCTCTTTCTCT 

ATCTTCCTCTTGGCTTTGCTGTCCTGTCTGACCGTTCCAGCTTCCGGTTATGAAGTGCG 

CAACGTGTCCGGGATGTACCATGTCACGAACGACTGCTCCAACTCAAGCATTGTGTAT

GAGGCAGCGGACATGATCATGCACACCCCCGGGTGCGTGCCCTGCGTTGGGGAGAAC 

AACTCTTCCCGCTGCTGGGTAGCGCTCACCCCCACGCTCGCAGCTAGGAACGCCAGCG

TCCCCACCACGACAATACGACGCCACGTCGATTGCCAGCTGTTCACCATCTCGCCTCG

CCGGCATGAGACGGTGCAGGACTGCAATTGCTCAATCTATCCCGGCCACATAACGGGT 

CACCGTATGGCTTGGGATATGATGATGAACTGGTCGCCTACAACGGCCCTGGTGGTAT 

CGCAGCTGCTCGGGATCCCACAAGCTGTCGTGGAGATGGTGGCGGGGGCCCATTGGGG 

AGTCCTGGCGGGTCTCGCCTACTATTCCATGGTGGGGAACTGGGCTAAGGTTTTGATTG 

TGATGCTACTCTTTGCTCCCTAATAG 

SEQ ID NO 23 (HCCI38) 

ATGTTGGGTAAGGTCATCGATACCCTTACATGCGGCTTCGCCGACCTCGTGGGGTACA

TTCCGCTCGTCGGCGCCCCCCTAGGGGGCGCTGCCAGGGCCCTGGCGCATGGCGTCCG 

GGTTCTGGAGGAGGGCGTGAACTATGCAACAGGGAATTTGCCCGGTTGCTCTTTCTCT

ATCTTCCTCTTGGGTTTGCTGTCCTGTCTGACCGTTCCAGCTTCCGCTTATGAAGTGCG

CAACGTGTCCGGGATGTACCATGTCACGAACGACTGCTCCAACTCAAGCATTGTGTAT 

GAGGCAGCGGACATGATCATGCACACCCCCGGGTGCGTGCCCTGCGTTCGGGAGAAC 

AACTCTTCCCGCTGCTGGGTAGCGCTCACCCCCACGCTCGCAGCTAGGAACGCCAGCG 

TCCCCACCACGACAATACGACGCCACGTCGATTCCCAGCTGTTCACCATCTCGCCTCG

CCGGCATGAGACGGTGCAGGACTGCAATTGCTCAATCTATCCCGGCCACATAACGGGT 

CACCGTATGGCTTGGGATATGATGATGAACTGGTAATAG

SEQ ID NO 25 (HCCI39) 

ATGTTGGGTAAGGTCATCGATACCCTTACATGCGGCTTCGCCGACCTCGTGGGGTACA
TTCCGCTCGTCGGCGCCCCCCTAGGGGGCGCTGCCAGGGCCCTGGCGCATGGCGTCCG 
GGTTCTGGAGGACGGCGTGAACTATGCAACAGGGAATTTGCCCGGTTGCTCTTTCTCT 
ATCTTCCTCTTGGCTTTGCTGTCCTGTCTGACCGTTCCAGCTTCCGCTTATGAAGTGCG
CAACGTGTCCGGGATGTACCATGTCACGAACGACTGCTCCAACTCAAGCATTGTGTAT 
GAGGCAGCGGACATGATCATGCACACCCCCGGGTGCGTGCCCTGCGTTCGGGAGAAC
AACTCTTCCCGCTGCTGGGTAGCGCTCACCCCCACGCTCGCAGCTAGGAACGCCAGCG 
TCCCCACCACGACAATACGACGCCACGTCGATTCCCAGCTGTTCACCATCTCGCCTCG 
CCGGCATGAGACGGTGCAGGACTGCAATTGCTCAATCTATCCCGGCCACATAAGGGGT 
CACCGTATGGCTTGGGATATGATGATGAACTGGTCGCCTACAACGGCCCTGGTGGTAT
CGCAGCTGCTCCGGATCCTCTAATAG 

SEQ ID NO 27 (HCCI40)

ATGTTGGGTAAGGTCATCGATACCCTTACATGCGGCTTCGCCGACCTCGTGGGGTACA 
TTCCGCTCGTCGGCGCCCCCCTAGGGGGCGCTGCCAGGGCGCTGGCGCATGGCGTCGG 
GGTTCTGGAGGACGGCGTGAACTATGCAACAGGGAATTTGCCCGGTTGCTCTTTCTCT
ATCTTCCTCTTGGCTTTGCTGTCCTGTCTGACCGTTCCAGCTTCCGCTTATGAAGTGCG 
CAACGTGTCCGGGATGTACCATGTCACGAACGACTGCTCCAACTCAAGCATTGTGTAT 
GAGGCAGCGGACATGATCATGCACACCCCCGGGTGCGTGCCCTGCGTTCGGGAGAAC 
AACTCTTCCCGCTGCTGGGTAGCGCTCACCCCCACGCTCGCAGCTAGGAACGCCAGCG 
TCCCCACCACGACAATACGACGCCACGTCGATTCCCAGCTGTTCACGATCTCGCCTCG 
CCGGCATGAGACGGTGCAGGACTGCAATTGCTCAATCTATCCCGGCCACATAACGGGT 
CACCGTATGGCTTGGGATATGATGATGAACTGGTCGCCTACAACGGCCCTGGTGGTAT 
CGCAGCTGCTCCGGATCGTGATCGAGGGCAGACACCATCACCACCATCACTAATAG

SEQ ID NO 29 (HCCI62)

ATGGGTAAGGTCATCGATACCCTTACGTGCGGATTCGCCGATCTCATGGGGTACATCC 
CGCTCGTCGGCGCTCCCGTAGGAGGCGTCGCAAGAGCCCTTGCGCATGGCGTGAGGGC 
CCTTGAAGACGGGATAAATTTCGCAACAGGGAATTTGCCCGGTTGCTCCTTTTCTATTT 
TCCTTCTCGCTCTGTTCTCTTGCTTAATTCATCCAGCAGCTAGTCTAGAGTGGCGGAAT 
ACGTCTGGCCTCTATGTCCTTACCAACGACTGTTCGAATAGCAGTATTGTGTACGAGGC 
CGATGACGTTATTCTGCACACACCCGGCTGCATACCTTGTGTCCAGGACGGCAATACA 
TCCACGTGCTGGACCCCAGTGACACCTACAGTGGCAGTCAAGTACGTCGGAGCAACCA 
CCGCTTCGATACGCAGTCATGTGGACCTATTAGTGGGGGCGGCCACGATGTGCTCTGC 
GCTCTACGTGGGTGACATGTGTGGGGCTGTCTTCCTCGTGGGACAAGCCTTCACGTTCA 
GACCTCGTCGCCATCAAACGGTCCAGACCTGTAACTGCTCGCTGTACCCAGGCCATCT 
TTCAGGACATCGAATGGCTTGGGATATGATGATGAACTGGTAATAG

SEQ ID NO 31 (HCCI63)

ATGGGTAAGGTCATCGATACCCTAACGTGCGGATTCGCCGATCTCATGGGGTATATCC 
CGCTCGTAGGCGGCCCCATTGGGGGCGTCGCAAGGGCTCTCGCACACGGTGTGAGGGT 
CCTTGAGGACGGGGTAAACTATGCAACAGGGAATTTACCCGGTTGCTCTTTCTCTATCT 

CCTCTGGGATTTATCATGTTACCAATGATTGCGCAAACTCTTCCATAGTCTATGAGGCA 
GATAACCTGATCCTACACGGACCTGGTTGCGTGCCTTGTGTCATGACAGGTAATGTGA 
GTAGATGCTGGGTCCAAATTACCCCTACACTGTCAGCCCCGAGCCTCGGAGCAGTCAC 
GGCTCC ,C, CGGAGAGCCGTT6ACTACCTAGCGGGAGGGGCTGCCCTCTGCTCCGCG 
TTATACGTAGGAGACGCGTGTGGGGCACTATTCTTGGTAGGCCAAATGTTCACCTATA 
GGCC , CGCC AGCACGCTACGGTGCAGAACTGCAACTGTTCCATTTACAGTGGCCATGT 
■ ACC GGCCACCGGATGGCATGGGATATGATGATGAACTGGTAATAG 

SEQ ID NO 33 (HCPrlOB) 
TGGGATATGATGATGAACTGGTC 

SEQ ID NO 34 (HCPr72) 

CTATTATGGTGGTAAKGCCARCARGAGCAGGAG 
SEQ ID NO 35 (HCCL22A) 

TGGGATATGATGATGAACTGGTCGCCTACAACGGCCCTGGTGGTATCGCAGCTGCTCC

GGATCCCACAAGCTGTCGTGGACATGGTGGCGGGGGCCCATTGGGGAGTCCTGGCGG 

GCCTCGCCTACTATTCCATGGTGGGGAACTGGGCTAAGGTTTTGGTTGTGATGCTACTC 

TTTGCCGGCGTCGACGGGCATACCCGCGTGTCAGGAGGGGCAGCAGCCTCCGATACCA 

GGGGCCTTGTGTCCCTCTTTAGCCCCGGGTCGGCTCAGAAAATCCAGCTCGTAAACAC 

CAACGGCAGTTGGCACATCAACAGGACTGCCCTGAACTGCAACGACTCCCTCCAAAC 

AGGGTTCTTTGCCGCACTATTCTACAAACACAAATTCAACTCGTCTGGATGCCCAGAG 

CGCTTGGCCAGCTGTCGCTCCATCGACAAGTTCGCTCAGGGGTGGGGTCCCCTCACTT

ACACTGAGCCTAACAGCTCGGACCAGAGGCCCTACTGCTGGCACTACGCGCCTCGACC 

GTGTGGTATTGTACCCGCGTCTCAGGTGTGCGGTGCAGTGTATTGCTTCACCGCGAGCC 
CTGTTGTGGTGGGGACGACCGATCGG 



rrAr _ GGTGTCCCCACGTATAACTGGGGGGCGAA 
CGACTCGGATGTGCTGATTCTGAAGAACACGGGGCCGCCGCGAGGCAACTGGTTCGGC 
GTACATGGATGAATGGCACTGGGTTCACCAAGACGTGTGGGGGCCCCCCGTGCAACA 
TCGGGGGGGCCGGCAACAACACCTTGACCTGCCCCACTGACTGTTTTCGGAAGCACCC
CGAGGCCACCTACGCCAGATGCGGTTCTGGGCCCTGGCTGACACCTAGGTGTATGGTT 
CATTACCCATATAGGCTCTGGCACTACCCCTGCACTGTCAACTTCACCATCTTCAAGGT

TAGGATGTACGTGGGGGGCGTGGAGCACAGGTTCGAAGCCGCATGCAATTGGACTCG 

AGGAGAGCGTTGTGAGTTGGAGGACAGGGATAGATCAGAGCTTAGCCCGCTGCTGCTG 

TCTACAACAGAGTGGCAGATACTGCCCTGTTCCTTCACCACCCTGCCGGCCCTATCCA

CCGGCCTGATCCACCTCCATCAGAACATCGTGGACGTGCAATACCTGTACGGTGTAGG 

GTCGGCGGTTGTCTCCCTTGTCATCAAATGGGAGTATGTCCTGTTGCTCTTCCTTCTCCT

GGCAGACGCGCGCATCTGCGCCTGCTTATGGATGATGCTGCTGATAGCTCAAGCTGAG

GCCGCCTTAGAGAACCTGGTGGTCCTCAATGCGGCGGCCGTGGCCGGGGCGCATGGC

ACTCTTTCCTTCCTTGTGTTCTTCTGTGCTGCCTGGTACATCAAGGGCAGGCTGGTCCC

TGGTGCGGCATACGCCTTCTATGGCGTGTGGCCGCTGCTCCTGCTTCTGCTGGCCTTAC 
CACCACGAGCTTATGCCTAGTAA

SEQ ID NO 37 (HCCI41)

GATCCCACAAGCTGTCGTGGACATGGTGGCGGGGGCCCATTGGGGAGTCGTGGCGGG 

CCTCGCCTACTATTCCATGGTGGGGAACTGGGCTAAGGTTTTGGTTGTGATGCTACTCT 

TTGCCGGCGTCGACGGGCATACCCGCGTGTCAGGAGGGGCAGCAGCCTCCGATACCA 

GGGGCCTTGTGTCCCTCTTTAGCCCCGGGTCGGCTCAGAAAATCCAGCTCGTAAACAC 

CAACGGCAGTTGGCACATCAACAGGACTGCCCTGAACTGCAACGACTCCCTCCAAAC

AGGGTTCTTTGCCGCACTATTCTACAAACACAAATTGAACTCGTCTGGATGGCCAGAG

CGCTTGGCCAGCTGTCGCTCCATCGACAAGTTCGCTCAGGGGTGGGGTCCCCTCACTT

ACACTGAGCCTAACAGCTCGGACCAGAGGCGCTACTGCTGGCACTACGCGCCTCGACC 

GTGTGGTATTGTACCCGCGTCTCAGGTGTGCGGTCCAGTGTATTGCTTCACCCCGAGCC 

CTGTTGTGGTGGGGACGACCGATCGGTTTGGTGTCCCCACGTATAACTGGGGGGCGAA 

CGACTCGGATGTGCTGATTCTCAACAACACGCGGCCGCCGCGAGGCAACTGGTTCGGC 

TGTACATGGATGAATGGCACTGGGTTCACCAAGACGTGTGGGGGCCCCCCGTGCAACA 

TCGGGGGGGCCGGCAACAACACCTTGACCTGCCCCACTGACTGTTTTCGGAAGCACCC

CGAGGCCACCTACGCCAGATGCGGTTCTGGGCCCTGGCTGACACCTAGGTGTATGGTT 

CATTACCCATATAGGCTCTGGCACTACCCCTGCACTGTCAACTTCACCATCTTCAAGGT 

TAGGATGTACGTGGGGGGCGTGGAGCACAGGTTCGAAGCCGCATGCAATTGGACTCG 

AGGAGAGCGTTGTGACTTGGAGGACAGGGATAGATCAGAGCTTAGCCCGCTGCTGCTG

TCTACAACAGAGTGGCAGAGTGGCAGAGCTTAATTAATTAG 
SEQ ID NO 39 (HCCI42) 

GATCCCACAAGCTGTCGTGGACATGGTGGCGGGGGCCCATTGGGGAGTCCTGGCGGG 
CCTCGCCTACTATTCCATGGTGGGGAACTGGGCTAAGGTTTTGGTTGTGATGCTACTCT 
TTGCCGGCGTCGACGGGCATACCCGCGTGTCAGGAGGGGCAGCAGCCTCCGATACCA
GGGGCCTTGTGTCCCTCTTTAGCCCCGGGTCGGCTGAGAAAATCCAGCTCGTAAACAC
CAACGGCAGTTGGCACATCAACAGGACTGCCCTGAACTGCAACGACTCCCTCCAAAC
AGGGTTCTTTGCCGCACTATTCTACAAACACAAATTGAACTCGTCTGGATGCCCAGAG
CGCTTGGCCAGCTGTCGCTCCATCGACAAGTTCGCTCAGGGGTGGGGTCCCCTCACTT 
ACACTGAGCCTAACAGCTCGGACCAGAGGCCCTACTGCTGGCACTACGCGCCTCGACC 
GTGTGGTATTGTACCCGCGTCTCAGGTGTGCGGTCCAGTGTATTGCTTCACCCCGAGCC
CTGTTGTGGTGGGGACGACCGATCGGTTTGGTGTCCCCACGTATAACTGGGGGGCGAA
CGACTCGGATGTGCTGATTCTCAACAACACGCGGCCGCCGCGAGGCAACTGGTTCGGC 
TGTACATGGATGAATGGCACTGGGTTCACCAAGACGTGTGGGGGCCCCCCGTGCAACA 
TCGGGGGGGCCGGCAACAACACCTTGACCTGCCCCACTGACTGTTTTCGGAAGCACCC
CGAGGCCACCTACGCCAGATGCGGTTCTGGGCCCTGGCTGACACCTAGGTGTATGGTT 
CATTACCCATATAGGCTCTGGCACTACCCCTGCACTGTCAACTTCACCATCTTCAAGCT
TAGGATGTACGTGGGGGGCGTGGAGCACAGGTTCGAAGCCGCATGCAATTGGAGTCG 
AGGAGAGCGTTGTGACTTGGAGGACAGGGATAGATCAGAGCTTAGCCCGCTGCTGCTG 
^TACAACAGGTGATCGAGGGCAGACACCATCACCACCATCACTAATAG 

SEQ ID NO 41 (HCCI43) 

ATGGTGGGGAACTGGGCTAAGGTTTTGGTTGTGATGGTACTCTTTGCCGGCGTCGACG
GGCATACCCGCGTGTCAGGAGGGGCAGCAGCCTCCGATACCAGGGGCCTTGTGTCGCT
CTTTAGCCCCGGGTCGGCTCAGAAAATCCAGCTCGTAAACACCAACGGCAGTTGGCAC 
ATCAACAGGACTGCCCTGAACTGCAACGACTCCCTCCAAACAGGGTTCTTTGCCGCAC 
TATTCTACAAACAGAAATTCAACTCGTCTGGATGCCCAGAGCGCTTGGCCAGCTGTCG
CTCCATCGACAAGTTCGCTCAGGGGTGGGGTCCCCTCACTTACACTGAGCCTAACAGC 
TCGGACCAGAGGCCCTACTGCTGGCACTACGCGCCTCGACCGTGTGGTATTGTACCCG
CGTCTCAGGTGTGCGGTCCAGTGTATTGCTTCACCCCGAGCCCTGTTGTGGTGGGGAC 
GACCGATCGGTTTGGTGTCCCCACGTATAACTGGGGGGCGAACGACTGGGATGTGCTG 
ATTCTCAACAACACGCGGCCGCCGCGAGGCAACTGGTTCGGCTGTACATGGATGAATG 
GCACTGGGTTCACCAAGACGTGTGGGGGGCCCCCGTGCAACATCGGGGGGGCCGGCA 
ACAACACCTTGACCTGCCCCACTGACTGTTTTCGGAAGGACCCCGAGGCCACCTACGC 
CAGATGCGGTTCTGGGCCCTGGCTGACACCTAGGTGTATGGTTCATTACCCATATAGG 
CTCTGGCACTACCCCTGCACTGTCAACTTCACCATCTTCAAGGTTAGGATGTACGTGGG 
GGGCGTGGAGCACAGGTTCGAAGCCGCATGCAATTGGACTCGAGGAGAGCGTTGTGA

CTTGGAGGACAGGGATAGATCAGAGCTTAGCCCGCTGCTGCTGTCTACAACAGAGTGG 
CAGAGCTTAATTAATTAG 

SEQ ID NO 43 (HCCI44)

ATGGTGGGGAACTGGGCTAAGGTTTTGGTTGTGATGCTACTCTTTGCCGGCGTCGACG

GGCATACCCGCGTGTCAGGAGGGGCAGCAGCCTCCGATACCAGGGGCCTTGTGTCCCT

CTTTAGCCCCGGGTCGGCTCAGAAAATCCAGCTCGTAAACACCAACGGCAGTTGGCAC 

ATCAACAGGACTGCCCTGAACTGCAACGACTCCCTCCAAACAGGGTTCTTTGCCGCAC 

TATTCTACAAACACAAATTCAACTCGTCTGGATGCCCAGAGCGCTTGGCCAGCTGTCG 

CTCCATCGACAAGTTCGCTGAGGGGTGGGGTCCCCTCACTTACACTGAGCCTAACAGC 

TCGGACCAGAGGCCCTACTGCTGGCACTACGCGCCTCGACCGTGTGGTATTGTACCCG

CGTCTCAGGTGTGCGGTCCAGTGTATTGCTTCACCCCGAGCCCTGTTGTGGTGGGGAC 

GACCGATCGGTTTGGTGTCCCCACGTATAACTGGGGGGCGAACGACTCGGATGTGCTG

ATTCTCAACAACACGCGGCCGCCGCGAGGCAACTGGTTCGGCTGTACATGGATGAATG 

GCACTGGGTTCACCAAGACGTGTGGGGGCCCCCCGTGCAACATCGGGGGGGCCGGCA

ACAACACCTTGACCTGCCCCACTGACTGTTTTCGGAAGCACCCCGAGGCCACCTACGC

CAGATGCGGTTCTGGGCCCTGGCTGACACCTAGGTGTATGGTTCATTACCCATATAGG

CTCTGGCACTACCCCTGCACTGTCAACTTCACCATCTTCAAGGTTAGGATGTACGTGGG

GGGCGTGGAGCACAGGTTCGAAGCCGCATGCAATTGGACTCGAGGAGAGCGTTGTGA

CTTGGAGGACAGGGATAGATCAGAGCTTAGCCCGCTGCTGCTGTCTACAACAGGTGAT 

CGAGGGCAGACACCATCACCACCATCACTAATAG 

SEQ ID NO 45 (HCCL64)

ATGGTGGCGGGGGCCCATTGGGGAGTCCTGGCGGGCCTCGCCTACTATTCCATGGTGG 
GGAACTGGGCTAAGGTTTTGGTTGTGATGCTACTCTTTGCCGGCGTCGACGGGCATAC 
CCGCGTGTCAGGAGGGGCAGCAGCCTCCGATACCAGGGGCCTTGTGTCCCTCTTTAGC 
CCCGGGTCGGCTCAGAAAATCCAGCTCGTAAACACCAACGGCAGTTGGCACATCAAC
AGGACTGCCCTGAACTGCAACGACTCCCTCCAAACAGGGTTCTTTGCCGCAGTATTCT
ACAAACACAAATTCAACTCGTCTGGATGCCCAGAGCGCTTGGCCAGCTGTCGCTCCAT 
CGACAAGTTCGCTCAGGGGTGGGGTCCCCTCACTTACACTGAGCCTAACAGCTCGGAC
CAGAGGCCCTACTGCTGGCACTACGCGCCTCGACCGTGTGGTATTGTACCCGCGTCTC
AGGTGTGCGGTCCAGTGTATTGCTTCACCCCGAGCCCTGTTGTGGTGGGGACGACCGA
TCGGTTTGGTGTCCCCACGTATAACTGGGGGGCGAACGACTCGGATGTGCTGATTCTC
AACAACACGCGGCCGCCGCGAGGCAACTGGTTCGGCTGTACATGGATGAATGGCACT 
GGGTTCACCAAGACGTGTGGGGGCCCCCCGTGCAACATCGGGGGGGCCGGCAACAAC 
ACCTTGACCTGCCCCACTGACTGTTTTCGGAAGCACCCCGAGGCCACCTACGCCAGAT 
GCGGTTCTGGGCCCTGGCTGACACCTAGGTGTATGGTTCATTACCCATATAGGCTCTGG 
CACTACCCCTGCACTGTCAACTTCACCATCTTCAAGGTTAGGATGTACGTGGGGGGCG 
TGGAGCACAGGTTCGAAGCCGCATGCAATTGGACTCGAGGAGAGCGTTGTGACTTGGA

GGACAGGGATAGATCAGAGCTTAGCCCGCTGCTGCTGTCTACAACAGAGTGGCAGATA 

CTGCCCTGTTCCTTCACCACCCTGCCGGCCCTATCCACCGGCCTGATCCACCTCCATCA 

GAACATCGTGGACGTGCAATACCTGTACGGTGTAGGGTCGGCGGTTGTCTCCCTTGTC 

ATCAAATGGGAGTATGTCCTGTTGCTCTTCCTTCTCCTGGCAGACGCGCGCATCTGCGC

CTGCTTATGGATGATGCTGCTGATAGCTCAAGCTGAGGCCGCCTTAGAGAACCTGGTG 

GTCCTCAATGCGGCGGCCGTGGCCGGGGCGCATGGCACTCTTTCCTTCCTTGTGTTCTT 

CTGTGCGCCTGGTACATCAAGGGCAGGCTGGTCCCTGGTGCGGCATACGCCTTCTAT 

GGCGTGTGGCCGCTGCTCCTGCTTCTGCTGGCCTTACCACCACGAGCTTATGGCTAGTAA

SEQ ID NO 47 (HCCI65)

AATTTGGGTAAGGTCATCGATACCCTCACATGCGGCTTCGCCGACCTCGTGGGGTACA 
TTCCGCTCGTCGGCGCCCCCCTAGGGGGCGCTGCCAGGGCCCTGGCGCATGGCGTCCG
GGTTCTGGAGGACGGCGTGAACTATGCAACAGGGAATTTGCCGGGTTGCTCTTTCTCT
ATCTTCCTCTTGGCTTTGCTGTCCTGTCTGACCGTTCCAGCTTCCGCTTATGAAGTGCG
CAACGTGTCCGGGATGTACCATGTCACGAAGGACTGCTCCAACTCAAGCATTGTGTAT
GAGGCAGCGGACATGATCATGCACACCCCCGGGTGCGTGCCCTGCGTTCGGGAGAAC 
AACTCTTCCCGCTGCTGGGTAGCGCTCACCCCCACGCTCGGAGCTAGGAACGCCAGCG 
TCCCCACCACGACAATACGACGCCACGTCGATTTGCTCGTTGGGGCGGCTGCTTTCTG 
TTCCGCTATGTACGTGGGGGACCTCTGCGGATCTGTCTTCCTCGTCTCCCAGCTGTTCA 
CCATCTCGCCTCGCCGGCATGAGACGGTGCAGGACTGCAATTGCTCAATCTATCCCGG
CCACATAACGGGTCACCGTATGGCTTGGGATATGATGATGAACTGGTCGCCTACAACG
GCCCTGGTGGTATCGCAGCTGCTCCGGATCCCACAAGCTGTCGTGGACATGGTGGCGG 
GGGCCCATTGGGGAGTCCTGGCGGGCCTCGCCTACTATTCCATGGTGGGGAACTGGGC 
TAAGGTTTTGGTTGTGATGCTACTCTTTGCCGGCGTCGACGGGCATACCCGCGTGTCAG
GAGGGGCAGCAGCCTCCGATACCAGGGGCCTTGTGTCCCTCTTTAGCCCCGGGTCGGC
TCAGAAAATCCAGCTCGTAAACACCAACGGCAGTTGGCACATCAACAGGACTGCCCT 
GAACTGCAACGACTCCCTCCAAACAGGGTTCTTTGCCGCACTATTCTACAAACAGAAA 
TTCAACTCGTCTGGATGCCCAGAGCGCTTGGCCAGCTGTCGCTCCATCGACAAGTTCG
CTCAGGGGTGGGGTCCCCTCACTTACACTGAGCCTAACAGCTCGGACCAGAGGGCCTA
CTGCTGGCACTACGCGCCTCGACCGTGTGGTATTGTACCGGGGTCTCAGGTGTGCGGT 
CCAGTGTATTGCTTCACCCCGAGCCCTGTTGTGGTGGGGACGACCGATCGGTTTGGTGT 
CCCCACGTATAACTGGGGGGCGAACGACTCGGATGTGCTGATTCTCAACAACACGCGG 
CCGCCGCGAGGCAACTGGTTCGGCTGTACATGGATGAATGGCACTGGGTTCACCAAGA
CGTGTGGGGGCCCCCCGTGCAACATCGGGGGGGCCGGCAACAACACCTTGACCTGCC 
CCACTGACTGTTTTCGGAAGCACCCCGAGGCCACCTACGCCAGATGCGGTTCTGGGCC

CTGGCTGACACCTAGGTGTATGGTTCATTACCCATATAGGCTCTGGCACTACCCCTGCA 

CTGTCAACTTCACCATCTTCAAGGTTAGGATGTACGTGGGGGGCGTGGAGCACAGGTT 

CGAAGCCGCATGCAATTGGACTCGAGGAGAGCGTTGTGACTTGGAGGACAGGGATAG

ATCAGAGCTTAGCCCGCTGCTGCTGTCTACAACAGAGTGGCAGATACTGCCCTGTTCC 

TTCACCACCCTGCCGGCCCTATCCACCGGCCTGATCCACCTCCATCAGAACATCGTGG 

ACGTGCAATACCTGTACGGTGTAGGGTCGGCGGTTGTCTCCCTTGTCATCAAATGGGA 

GTATGTCCTGTTGCTCTTCCTTCTCCTGGCAGACGCGCGCATCTGCGCCTGCTTATGGA

TGATGCTGCTGATAGCTCAAGCTGAGGCCGCCTTAGAGAACCTGGTGGTCCTCAATGC

GGCGGCCGTGGCCGGGGCGCATGGCACTCTTTCCTTCCTTGTGTTCTTCTGTGCTGCCT 

ggtacatcaagggcaggctggtccctggtgcggcatacgccttctatggcgtgtggcc 
gctgctcctgcttctgctggccttaccaccacgagcttatgcctagtaagctt

SEQ ID NO 49 (HCCI66)

atgagcacgaatcctaaacctcaaagaaaaaccaaacgtaacaccaaccgccgccca 
caggacgtcaagttcccgggcggtggtcagatcgttggtggagtttacctgttgccgc

GCAGGGGCCCCAGGTTGGGTGTGCGCGCGACTAGGAAGACTTCCGAGCGGTCGCAAC
CTCGTGGGAGGCGACAACCTATCCCCAAGGCTCGCCGACCCGAGGGTAGGGCCTGGG 
CTCAGCCCGGGTACCCTTGGCCCCTCTATGGCAATGAGGGCATGGGGTGGGCAGGATG 

gctcctgtcaccccgcggctctcggcctagttggggccctacagacccccggcgtagg
tcgcgtaatttgggtaaggtcatcgatacccttacatgcggcttcgccgacctcgtgg 
ggtacattccgctcgtcggcgcccccctagggggcgctgccagggccctggcgcatgg 
cgtccgggttctggaggacggcgtgaactatgcaacagggaatttgcccggttgctct 
ttctctatcttcctcttggctttgctgtcctgtctgaccgttccagcttccgcttatgaa
gtgcgcaacgtgtccgggatgtaccatgtcacgaacgactgctccaactcaagcattg 
tgtatgaggcagcggacatgatcatgcacacccccgggtgcgtgccctgcgttcggga 
gaacaactcttcccgctgctgggtagcgctcacccccacgctcgcagctaggaacgcc 
agcgtccccaccacgacaatacgacgccacgtcgatttgctcgttggggcggctgctt 

TCTGTTCCGCTATGTACGTGGGGGACCTCTGCGGATCTGTCTTCCTCGTCTCCCAGCTG

TTCACCATCTCGCCTCGCCGGCATGAGACGGTGCAGGACTGCAATTGCTCAATCTATC

CCGGCCACATAACGGGTCACCGTATGGCTTGGGATATGATGATGAACTGGTCGCCTAC
AACGGCCCTGGTGGTATCGCAGCTGCTCCGGATCCCACAAGCTGTCGTGGACATGGTG
GCGGGGGCCCATTGGGGAGTCCTGGCGGGCCTCGCCTACTATTCCATGGTGGGGAACT

gggctaaggttttggttgtgatgctactctttgccggcgtcgacgggcatacccgcgt

GTCAGGAGGGGCAGCAGCCTCCGATACCAGGGGCCTTGTGTCCCTCTTTAGCCCCGGG
TCGGCTCAGAAAATCCAGCTCGTAAACACCAACGGCAGTTGGCACATCAACAGGACT
GCCCTGAACTGCAACGACTCCCTCCAAACAGGGTTCTTTGCCGCACTATTCTACAAAC

acaaattcaactcgtctggatgcccagagcgcttggccagctgtcgctccatcgacaa

GTTCGCTCAGGGGTGGGGTCCCCTCACTTACACTGAGCCTAACAGCTCGGACCAGAGG 
CCCTACTGCTGGCACTACGCGCCTCGACCGTGTGGTATTGTACCCGCGTCTCAGGTGT 
GCGGTCCAGTGTATTGCTTCACCCCGAGCCCTGTTGTGGTGGGGACGACCGATCGGTT 

tggtgtccccacgtataactggggggcgaacgactcggatgtgctgattctcaacaac

acgcggccgccgcgaggcaactggttcggctgtacatggatgaatggcactgggttca 

ccaagacgtgtgggggccccccgtgcaacatcgggggggccggcaacaacaccttga 

CCTGCCCCACTGACTGTTTTCGGAAGCACCCCGAGGCCACCTACGCCAGATGCGGTTC 

tgggccctggctgacacctaggtgtatggttcattacccatataggctctggcactac 
ccctgcactgtcaacttcaccatcttcaaggttaggatgtacgtggggggcgtggagc
acaggttcgaagccgcatgcaattggactcgaggagagcgttgtgacttggaggaca
gggatagatcagagcttagcccgctgctgctgtctacaacagagtggcagatactgcc
ctgttccttcaccaccctgccggccctatccaccggcctgatccacctccatcagaac
atcgtggacgtgcaatacctgtacggtgtagggtcggcggttgtctcccttgtcatca
aatgggagtatgtcctgttgctcttccttctcctgggagacgcgcgcatctgcgcctgc
ttatggatgatgctgctgatagctcaagctgaggccgccttagagaacctggtggtcc

TCAATGCGGCGGCCGTGGCCGGGGCGCATGGCACTCTTTCCTTCCTTGTGTTCTTCTGT
GCTGCCTGGTACATCAAGGGCAGGCTGGTCCCTGGTGCGGCATACGCCTTCTATGGCG
TGTGGCCGCTGCTCCTGCTTCTGCTGGCCTTACCACCACGAGCTTATGCCTAGTAA 
Figure 22

OD measured at 450 nm
construct

Fraction volume dilution 39 40 62 63

Type Type Type Type

1b 1b

START 23 ml 1/20
Fraction volume dilution



Figure 24 



OD measured at 450 nm 
construct 

39 40 62 63 

Type Type Type Type 

1b 1b 3a 5a



250 1/200 0.072 0.130 0.096 0.051

„ 0.109 0.293 0.084 0.052

0.279 0.249 0.172 0.052

0.151 0.297 0.054

0.080 0.266 0.438 0.056 



^ 0.093 0.151 



3 1.649 0.722 0.065

3 3 2.528 0.889 



26 °/ 251 0.457 0048 
27 

29 ; 3 3 . 2.345 
30 

t\ 0.263 
32 



33 0.103 

„ 0.045 

^ 0.043 
36 

37 

38 

39 

40 



3 3 2.849 2.580

2.22/ 1.921 1.424 1.333

0.415 0.356 0.162

0.071 0.172 0.154 0.064 

0.054 0.096 0.057 

0.045 0.044 0.051 

0.047 0.045 0.046 

0.045 0.045 0.049 0.040 

0.045 0.047 0.046 0.048 

0.046 0.048 0.047 0.057 

0.045 0.048 0.050 0.057 

0.046 0.049 0.048 0.049 
FIGURE 25
Western Blot Analysis with anti-E1 mouse monoclonal 5E1A10
A: NON-REDUCED

E2 - CONTAMINANTS (AGGREGATES) 




20 . FRACTIONS 

FIGURE 31 
FIGURE 32



FIGURE 33: 

SILVER STAIN OF PURIFIED E2 
M J 2 
Total area above baseline = 0.796522 ml*AU

Total area in evaluated peaks = 0.796521 ml*AU 

Ratio peak area / total area = 0.999999 

Total peak duration = 2.6135S3 ml 
Figure 38

Relative Map Positions of 
anti-E2 monoclonal antibodies 
Figure 39
Figure 40
Figure 44A
Figure 44B
Figure 46